Clemson University TigerPrints (http://tigerprints.clemson.edu/all_dissertations)
All Dissertations, 12-2014

An Approximation Algorithm for the Stable Marriage Problem with Ties and Incomplete Lists
Rommel Jalasutram, Clemson University

Recommended Citation: Jalasutram, Rommel, "An Approximation Algorithm for the Stable Marriage Problem with Ties and Incomplete Lists" (2014). All Dissertations. Paper 1615.

A Dissertation Presented to the Graduate School of Clemson University In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy, Computer Science
by Rommel Jalasutram
December 2014
Accepted by: Dr. Brian C. Dean, Committee Chair; Dr. Amy Apon; Dr. Ilya Safro; Dr. Pradip K. Srimani

Abstract

Consider the bipartite matching problem with two sets of participants: men (L) and women (R). Each participant has a strict and complete preference list over participants from the other set. The goal is to pair men and women such that no one can improve their partner by breaking away from the centralized matching scheme. Problems that exhibit this flavor are commonly classified as ordinal matchings. Gale and Shapley showed how to obtain such a matching (otherwise known as a “stable” matching). Generalizations of this model have found relevance in several centralized matching schemes such as the National Resident Matching Program, where graduating doctors are matched to hospitals, human organ transplant exchange markets, and housing allocation markets.
In this thesis, we study a generalization of the stable matching problem, where the preference lists of participants are allowed to contain ties and missing entries. Here, we seek to find a stable matching of maximum cardinality. The problem was shown to be NP-hard, and is one of the most prominent problems in the domain of ordinal matching. When ties are restricted to one side of the problem, Iwama et al. [28] devised a variant of the famous Gale-Shapley stable matching algorithm that breaks ties using edge weights obtained by solving a linear programming relaxation of the problem, leading to an approximation ratio of 25/17 (approximately 1.47058). We apply ideas from factor-revealing LPs to show that this analysis in [28] is an upper bound, but that the algorithm and analysis can be improved to yield an approximation ratio of at most 19/13 (approximately 1.461538), improving the best currently-known approximation ratio (obtained via different techniques in [39]) of 41/28 (approximately 1.4643).

Dedicated to my father, (late) Dr. Jalasutram Muralidhar

Acknowledgments

My stay in graduate school has been longer than I had imagined. This very day seven years ago, I left my country to pursue higher studies. It has turned out to be a rather interesting journey, and many people I have come to know are responsible for it.

First, I would like to thank my advisor Dr. Brian Dean for accepting me as his student. I consider myself to have been extremely lucky to have him as my advisor. Dr. Dean has encouraged me at every step, was extremely patient with me, was generous with his time and ideas, and gave me the freedom to choose my topic of interest. Without his optimism, I would still be stuck in a cave trying to find my way out. His motto to keep things simple, “Mother Nature never intended it to be this complicated”, has helped in simplifying the results presented in this dissertation. Working with Dr. Dean has been a great pleasure and an honor.
I would like to thank Dr. Amy Apon, Dr. Ilya Safro, and Dr. Pradip Srimani for being on my committee. Next, I am grateful to Dexter Stowers and Dr. Wayne Madison for helping me secure funding during my first year of PhD studies. Without their help, I could not have started this journey.

Working in the Applied Algorithms Lab has been a fun-filled experience. I would like to thank my group members Raghuveer Mohan, Chad Waters and Matt Dabney for making sure I had a great time throughout. Over the years, I have made really amazing friends who have stood by me throughout and made my stay at Clemson memorable. Off the top of my head, I wish to acknowledge Achal Singhal, Biswa Singh, Sandeep Lokala, Kalaivani Sundararajan, Sumod Mohan, Dhananjay Joshi, Kirti Kanitkar, Shivkumar Morkhande, Sahasranshu Panda, Keya Sharma, Biswajit Mazumdar, Uttara Thakre, Rachana Ranade and many others for being there.

Finally, it goes without saying that I owe everything I have achieved to my parents, (late) Dr. Jalasutram Muralidhar and Jalasutram Durga, and my younger brother Kartik. They made sure I had a happy childhood which continues to this day. My parents-in-law, Dr. Srikanth Guruswamy and Kanakadhara Srikanth, and my sister-in-law Abhinaya have given me a second family in the last few years. They have been as warm and loving as my own family. Last but not least, my wife, Ahalya, has been my pillar of strength, who put up with my ups and downs and made sure I was well taken care of.

This research has been supported in part by Dr. Dean’s NSF CAREER award CCF-0845593.

Table of Contents

Title Page
Abstract
Dedication
Acknowledgments
List of Tables
List of Figures
1 Introduction
  1.1 Stable Matchings
  1.2 Stable Matching Generalizations
  1.3 Main Contributions
2 Background
3 Preliminaries
  3.1 Symmetric Difference and Augmenting Paths
  3.2 Linear Programming Relaxation
  3.3 Stable Allocation Relaxation
4 Counterexamples and Lessons Learned
  4.1 Iterative and Randomized Rounding
  4.2 Deterministic “Primal” LP Rounding
  4.3 Stable Allocation Variants
5 Factor-Revealing LP and 19/13 Approximation
  5.1 GS-LP Algorithm
  5.2 Longer Augmenting Paths and Validity
  5.3 Towards a Factor-Revealing LP
6 Conclusions and Discussion
Bibliography

List of Tables

5.1 Different types of R edges. These edges come into existence due to GS-LP. Columns 2 and 3 represent the corresponding preference lists of man m1 and woman w1.
5.2 Approximation Ratios for LPf(d)

List of Figures

1.1 A maximum matching that is unstable. Preference lists are shown alongside each individual, with partners shaded. People on the left in a preference list are more preferred.
1.2 A maximum matching that is stable.
1.3 Maximum vs maximal stable matchings.
3.1 Maximum vs maximal stable matchings.
3.2 Augmenting 7-paths in 2 different configurations. Ties are indicated with parentheses.
3.3 Eliminating a cycle by spinning flow. At each node we increment along an incoming edge and decrement along an outgoing edge or vice-versa. Change in allocation at each node is 0.
3.4 Feasible allocation x* for edge (mi, wj) and 0 < x*ij < 1.
3.5 Path of length 5 in the half-integral auxiliary graph G′. Becomes an augmenting path if diagonal edges are chosen in the matching.
4.1 LP and IP solutions to the instance I2.
4.2 Instance where allocation to men in ties is equal. The fractional edges form a cycle.
4.3 An instance where continuous rounding fails.
5.1 Augmenting 3-path.
5.2 Two examples of augmenting 5-paths eliminated due to GS-LP algorithm.
5.3 The unique configuration of an augmenting 5-path.
5.4 Augmenting 7-paths due to 2 different configurations.
5.5 Different types of preference lists. G and O indicate the GS-LP and optimal edges. T indicates a tie, and (p) indicates that the particular person has high priority at the end of GS-LP.
5.6 Invalid Augmenting 5-path.
5.7 Edge belonging to M ∆ M_OPT. | | indicates a tied region.

Chapter 1
Introduction

Matching is a well-studied problem that finds relevance in several real-world scenarios. A matching is a set of edges in which no two share a common node. Generally we pair nodes based on some criterion. Matching problems can be categorized into two subclasses. The first subclass explores the optimization side of the problem, the objective being to obtain matchings of minimum cost or maximum value. For example, given a set of jobs and machines, and a cost function for all the job-machine pairs, our goal might be to obtain a matching of all jobs to machines at the lowest cost possible. The second subclass models problems that exhibit a game-theoretic flavor. Human organ transplant exchange markets, housing allocation, and stable matchings are some of the more popular examples. Problems from this subclass are also referred to as ordinal matching problems, since they involve ranked preference lists rather than explicit costs. Ordinal matching problems are common in situations where we are unable to assess the quality of the solution in terms of numeric cost. In this thesis we study ordinal matching problems on bipartite graphs, the focus being stable matchings.
Interest in stable matchings was sparked in the early 1950s by the initial question of how to better match students to colleges. Wide applicability and relevance have contributed to an increase in attention to this area of study from mathematicians, computer scientists, and economists alike in the last few decades. Countries such as the USA, Canada, Scotland, and Japan rely on centralized matching schemes to match graduating medical students to hospitals based on the preferences of students and hospitals for each other [1, 2, 3, 18, 40]. Other relevant applications include matching students to schools [4, 5, 44] and universities [9]. In each of these examples, the participants submit their preferences to a central matching agency. Thus, stable matching is a popular model for ordinal centralized matching in practice.

[Figure 1.1: A maximum matching that is unstable. Preference lists are shown alongside each individual, with partners shaded. People on the left in a preference list are more preferred. Maximum matching M = {(m2, w1), (m1, w2)}; blocking pair (m2, w2).]

Stable matching belongs to an area of research that has come to be known as algorithmic game theory. The last few years have seen an explosive growth in research being conducted in this area due to its ability to capture various aspects of practical problems. The Nobel prize in economics for the year 2012 was awarded to Alvin E. Roth and Lloyd S. Shapley for their work on “stable allocations and practice of market design”. Game theoretically speaking, we would like to design a matching mechanism under which, given a matching, none of the participants involved have any incentive to break away and trade among themselves in order to obtain a better match. A matching achieving this objective is deemed stable.

1.1 Stable Matchings

Stable matching instances are represented by a bipartite graph G = (V = L ∪ R, E).
Conventionally, the sets L and R represent the set of men and women participating in the matching. Each person has a totally ordered (complete) preference list over all the persons from the other set. The set of edges E represents the acceptable pairs, defined by the preference lists of men and women. We define M(p) to be the partner of person p in the matching M. For an edge (m, w) ∈ E, where m ∈ L and w ∈ R, woman w is present in man m’s preference list and vice-versa. For some matching M ⊆ E, edge (m, w) ∈ E \ M is a blocking pair if either m is unmatched or prefers w to his partner M(m), and similarly w is unmatched or prefers m to her partner M(w). Simply put, both of them have worse partners and would be better off matched with each other. Thus, such pairs stand to gain by breaking away from the matching scheme, as they can obtain better partners. A matching M is said to be stable if there exist no blocking pairs.

[Figure 1.2: A maximum matching that is stable. Stable matching M = {(m1, w1), (m2, w2)}.]

Maximum cardinality matchings need not necessarily be stable. This is illustrated by Figures 1.1 and 1.2. The size of the maximum matching for the instance is two. Figure 1.1 shows a maximum matching (indicated by dashed lines) that is unstable due to the presence of blocking pair (m2, w2), while Figure 1.2 shows a maximum stable matching.

1.1.1 The Gale-Shapley Algorithm

Gale and Shapley [14], in a seminal paper, showed that every instance admits a stable matching of cardinality min(|L|, |R|), and proposed an O(n^2) algorithm, commonly known as the Gale-Shapley (GS) algorithm, to obtain such a matching. Each man has a preference list over all the participating women and vice-versa. The preferences are strict and list persons from the most desired to the least desired. The algorithm consists of a series of proposals by single men to acceptable women.
In the worst case, the running time is O(|E|), as we might have to go through the entire preference list of all the men.

The idea behind the algorithm is simple. All men start out single. An arbitrary single man m is chosen to propose. Man m proposes to the most desirable woman w on his list to whom he has not yet proposed. On receiving a proposal, woman w has two choices: accept or reject. She accepts the proposal if she is single or if m is a strictly better match than her current partner, and rejects otherwise. Once matched, a woman stays matched, as any accepted proposal is only from a strictly better suitor. Each rejection causes a man to move down his list and consider less desired women, while each acceptance enables a woman to move up her list and be matched with more desired men. A stable matching thus obtained is man-optimal and woman-pessimal.

An interesting observation is that running the GS algorithm could give us different stable matchings, but the same set of people always end up matched. Choosing an arbitrary single man for proposing seems to introduce a level of nondeterminism. However, from [14] we see that such nondeterminism is inconsequential, as termination of the GS algorithm guarantees a stable matching.

1.2 Stable Matching Generalizations

Owing to the number of participants in large matching schemes, it is unrealistic to assume that each participant has a complete preference list. For example, approximately 30,000 medical school graduates take part in a centralized matching scheme, the National Resident Matching Program (NRMP), which matches them to hospitals. It would be unrealistic to expect a hospital to rank all of the applicants. Thus, we look at the three possible generalizations based on modifications to the preference lists of the participants.

Stable Matching with Incomplete Preference Lists. We allow incomplete lists where each man ranks a subset of the participating women and vice-versa.
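The proposal procedure of Section 1.1.1 can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this dissertation; the function name `gale_shapley`, the dictionary-based input format, and the two-by-two instance are assumptions chosen for the example. The sketch accepts strict lists that may be incomplete.

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Man-proposing Gale-Shapley for strict, possibly incomplete lists.

    men_prefs[m]   -- women acceptable to m, most preferred first
    women_prefs[w] -- men acceptable to w, most preferred first
    Returns a dict mapping each matched man to his partner.
    """
    # rank[w][m] is the position of m on w's list (smaller is better).
    rank = {w: {m: i for i, m in enumerate(ms)} for w, ms in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}  # next position on m's list to try
    partner = {}                          # woman -> her current partner
    free = deque(men_prefs)               # single men who may still propose

    while free:
        m = free.popleft()
        if next_idx[m] >= len(men_prefs[m]):
            continue                      # list exhausted: m stays single
        w = men_prefs[m][next_idx[m]]
        next_idx[m] += 1
        if m not in rank.get(w, {}):
            free.append(m)                # w does not list m: rejected
            continue
        current = partner.get(w)
        if current is None:
            partner[w] = m                # w is single: she accepts
        elif rank[w][m] < rank[w][current]:
            partner[w] = m                # m is strictly better: w switches
            free.append(current)
        else:
            free.append(m)                # rejected: m moves down his list
    return {m: w for w, m in partner.items()}

# Hypothetical instance in which (m2, w2) blocks {(m2, w1), (m1, w2)}.
men = {"m1": ["w2", "w1"], "m2": ["w2", "w1"]}
women = {"w1": ["m1", "m2"], "w2": ["m2", "m1"]}
print(gale_shapley(men, women))
```

Each man advances through his list at most once, so the number of proposals is bounded by the total length of the men's lists. With incomplete lists a man may exhaust his list and remain single, which the sketch handles by dropping him from the queue.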
Here, a blocking pair also includes pairs (m, w) ∈ E \ M such that man m is unmatched and woman w prefers man m to her partner M(w) or vice-versa. Every instance of the new problem admits a stable matching, which can be found using the GS algorithm [18]. The matching thus obtained might not match everyone participating, but it can be shown that in every stable matching, the same set of men and women end up matched [13]. Thus, the size of every stable matching is the same.

Stable Matching with Ties. A similar generalization to the problem is to allow preference lists to contain ties (indifference to choices). This commonly occurs in practice as people might have equal preferences for certain choices (e.g. one prefers vanilla and chocolate flavors equally). The definition of stability in this case changes slightly. Three classes of stability have been studied in the literature, namely weak, strong and super. In super stability, a pair (m, w) ∈ E \ M is blocking if man m finds woman w equally or more preferred than his current match, and vice-versa. In strong stability, a pair (m, w) ∈ E \ M is blocking if man m strictly prefers woman w to his current match and woman w prefers man m equally or more than her current match, or the same holds with the roles of m and w exchanged. In weak stability, a pair (m, w) ∈ E \ M is considered blocking if man m strictly prefers woman w to his match and vice-versa.

[Figure 1.3: Maximum vs maximal stable matchings.]

It is useful to note that any super stable matching is strongly stable, and any strongly stable matching is weakly stable. In case of super and strong stability, there exist instances for which no stable matching exists. Nevertheless, algorithms exist which can determine in polynomial time whether an instance admits a matching that is super or strongly stable [21, 27]. A weakly stable matching always exists and can be found in polynomial time [21].
For this reason it seems to be by far the most popular of the three classes. Thus, we consider instances under weak stability in this thesis. Here, we can use the GS algorithm, breaking ties arbitrarily, and still obtain stable matchings with cardinality min(|L|, |R|).

Stable Matching with Ties and Incomplete Lists (SMTI). Following the same line of thought, one could ask what happens if both ties and incomplete lists are allowed. The notation ≻ indicates a strict preference between two choices under consideration, while ⪰ indicates strict preference or indifference. To re-state the definition of weak stability formally, a pair (m, w) ∈ E \ M is said to be blocking if the following constraints hold: (i) M(m) ≠ w, (ii) w is single or m ≻_w M(w), and (iii) m is single or w ≻_m M(m).

Unlike the previous two generalizations, one can obtain stable matchings with varying cardinalities here. Figure 1.3 is one such example. Here, the size of the maximum cardinality stable matching M = {(m1, w1), (m2, w2)} is two and is indicated by solid edges. A matching is said to be maximal if we cannot add any further edges to it. In Figure 1.3, the maximal stable matching M = {(m1, w2)} has cardinality one and is indicated by a dashed edge.

Given this possibility, we would like to obtain stable matchings of maximum cardinality (MAX SMTI). Unfortunately, the MAX SMTI problem was shown to be NP-hard [24, 34]. In the special case where ties are restricted to one side (women), the problem still remains NP-hard.

Definition 1. An α-approximation algorithm for an optimization problem is a polynomial-time algorithm that guarantees a ratio of at most α between the value of an optimal solution and the value of the solution returned by the algorithm, over all instances of the problem. For a maximization problem, α ≥ 1.

A stable matching, being maximal, provides a matching that is at least 1/2 the size of an optimal matching.
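The formal definition of weak stability given above can be checked mechanically. The sketch below is illustrative only: the function names, the representation of a preference list as a list of tiers (people in one tier are tied), and the three-edge instance are assumptions. The instance is a reconstruction consistent with the matchings named for the maximum-versus-maximal example, with m1 and m2 tied on w2's list.

```python
def tier_index(pref_tiers):
    """Map each listed person to the index of their tier (0 is best)."""
    return {p: t for t, tier in enumerate(pref_tiers) for p in tier}

def blocking_pairs(men_prefs, women_prefs, matching):
    """Return the pairs that block `matching` under weak stability.

    men_prefs / women_prefs map a person to a list of tiers.
    `matching` is a set of (man, woman) pairs.
    """
    m_rank = {m: tier_index(t) for m, t in men_prefs.items()}
    w_rank = {w: tier_index(t) for w, t in women_prefs.items()}
    wife = {m: w for m, w in matching}
    husband = {w: m for m, w in matching}
    blocking = []
    for m, r in m_rank.items():
        for w in r:
            # Skip unacceptable pairs and pairs already matched together.
            if m not in w_rank.get(w, {}) or (m, w) in matching:
                continue
            # Each side must be single or strictly prefer the other.
            m_gains = m not in wife or r[w] < r[wife[m]]
            w_gains = w not in husband or w_rank[w][m] < w_rank[w][husband[w]]
            if m_gains and w_gains:
                blocking.append((m, w))
    return blocking

# Assumed instance: m1 ranks w2 above w1, m2 lists only w2,
# w1 lists only m1, and w2 is indifferent between m1 and m2.
men = {"m1": [["w2"], ["w1"]], "m2": [["w2"]]}
women = {"w1": [["m1"]], "w2": [["m1", "m2"]]}

print(blocking_pairs(men, women, {("m1", "w2")}))                # maximal
print(blocking_pairs(men, women, {("m1", "w1"), ("m2", "w2")}))  # maximum
```

Both matchings are free of blocking pairs on this instance, yet one has a single edge and the other has two, so the factor of 1/2 stated above is attained here.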
In other words, an optimal matching has at most twice as many edges as a maximal matching. Running the GS algorithm on an SMTI instance gives a maximal stable matching. Thus we can observe that the GS algorithm gives a 2-approximation to the MAX SMTI problem.

1.3 Main Contributions

In this dissertation, we consider the MAX SMTI problem with ties restricted to one side. The current best known approximation factor is 41/28 ≈ 1.464285 [39]. We improve upon this result to obtain an approximation ratio of 19/13 ≈ 1.461538. In order to obtain the improved result, our algorithm uses information from the linear programming (LP) relaxation of the integer programming (IP) formulation of the SMTI problem to break ties. We analyze the algorithm using another LP, commonly known as a factor-revealing LP. Here we look at the residual structure of the matching obtained by the algorithm and rewrite the structure in terms of LP constraints. Solving the new LP gives the necessary approximation ratio.

In Chapter 2, we survey the literature and present the current known results. Chapter 3 discusses the preliminaries necessary to understand and analyze the problem. In Chapter 4, we look at common approximation techniques and the reasons they might be inapplicable to the MAX SMTI problem. In Chapter 5, we describe the algorithm and lay the foundations necessary to develop a factor-revealing LP and use it to obtain an approximation ratio of 1.461538, thereby improving upon the current best known ratio. We conclude in Chapter 6 with open questions arising from this work.

Chapter 2
Background

The MAX SMTI problem has received substantial attention in the literature. It was shown to be NP-hard [24, 34]. In the last few years, there has been a sequence of improvements over the trivial 2-approximation obtained with the GS algorithm. Initial improvements were obtained using local search techniques [25, 27]. The approximation ratio here approaches 2 as |V| tends to infinity.
The factor was further improved via clever enhancements to 15/8 [26]. Király [30] introduced the idea of promotions to break ties and further improved the approximation ratio to 5/3. He used a modification of the GS algorithm such that each man is now permitted to propose to the women on his list more than once, each time with higher “priority”. McDermid [35] improved this to 3/2 by exploiting a classical graph theoretic result on maximum bipartite matching known as the Gallai-Edmonds decomposition. The runtime complexity of this algorithm is O(|V|^(3/2) |E|). Currently, this is the best known result for the general SMTI problem where ties are allowed in the preference lists of both men and women. Paluch [38] and Király [31] gave linear time algorithms while obtaining the same ratio. The only known approximation ratio better than 3/2 with ties on both sides is 10/7, which Huang et al. [20] obtained for the special case where the length of each tie is restricted to 2.

In the special case where ties are restricted to one side (i.e., ties allowed in the preference lists of either men or women but not both), the problem continues to be NP-hard. For this case, Iwama et al. [28] succeeded in improving the approximation ratio to 25/17 ≈ 1.47058. They formulated the maximum cardinality stable matching problem as an integer program, and relied on the fractional values obtained by solving its linear programming relaxation to break ties, along with the idea of promotions used earlier by Király [30] and McDermid [35]. The integrality gap (which we describe later) of the SMTI LP relaxation for one-sided ties is lower bounded by 1 + 1/e ≈ 1.3678 [28]. Thus, we cannot obtain better approximation bounds than this using the SMTI LP relaxation.

Recently, Huang et al. [20] improved the approximation ratio to 22/15 ≈ 1.46667. They start by computing a half-integral stable matching using a modification of the GS algorithm.
Following this, they use a charging argument in order to obtain the result. Unlike the previous approaches, here each man is allowed two proposals and each woman can accept up to two proposals. The analysis of the algorithm was further improved by Radnai [39] and the bound was reduced to 41/28 ≈ 1.46428. Currently, this is the best known approximation bound.

Inapproximability results determine the bounds beyond which we cannot improve upon the approximation ratios. They are usually tied to two conjectures, namely (i) P ≠ NP, and (ii) the unique games conjecture (UGC). Obtaining an approximation ratio better than the bound would refute the corresponding conjecture. The general MAX SMTI problem cannot be approximated to better than 33/29 unless P = NP [46]. Under the much stronger UGC, it cannot be approximated to better than 4/3 [46]. For the special case where the length of a man’s preference list is bounded by 2, the problem is polynomial-time solvable even when women’s preference lists are incomplete and contain ties. On increasing the length of a man’s preference list to 3, the problem becomes NP-hard [22].

If we relax our problem and allow ties only on one side, we can approximate better than the general case. The problem remains NP-hard, and it cannot be approximated to better than 21/19 unless P = NP [46]. Under the UGC, it cannot be approximated to better than 5/4 [19]. For the special case where ties are permitted at the end of preference lists in one-sided ties, Iwama et al. [23] were able to match the UGC lower bound.

We cannot improve upon the results of McDermid [35] for cases where ties are allowed on both the sides given the UGC. Thus, a natural question to raise would be whether we can improve upon the current best known result by [39] of 41/28 for the case where we restrict ties to one side. In this dissertation, we answer this question in the affirmative and show that this can be improved to 1.461538.

Stable Allocation.
The stable matching problem is a “one-to-one” matching problem, as each man can be matched to at most one woman. The NRMP has been in use since the 1950s and is a “many-to-one” generalization where each medical student is matched to a hospital and the participating hospitals can accept many students. Here, we can think of each individual as unit-sized, while hospitals can be seen as non-unit-sized. Baïou and Balinski [6] went a step further and looked at the case where both sides of the bipartite graph consisted of non-unit-sized entities. Each such entity has a quota q(i), which indicates the number of elements from the other side it should be matched with. Later, they extended this and studied the case where quotas q(i) are arbitrary real numbers [7]. In the literature, this has also been known as the “ordinal transportation problem”, as it is similar to the cost-based transportation problem. Dean et al. [12] gave a clever algorithm to solve the problem in O(|E| log |V|) time, improving upon the previously known O(|V||E|) time. These generalizations can be seen as high-multiplicity variants of the stable matching problem. We show that the new approach due to Huang et al. [20] can be considered more naturally in the context of stable allocations, and that this leads to an alternative mathematical framework that may be useful to consider moving forward for developing improved approximations to the SMTI problem.

Factor Revealing LPs. We analyze our algorithm with the help of another LP. For purposes of analysis, we are interested in an instance that produces the worst possible approximation factor. We formulate the new LP such that its corresponding polytope captures all possible instances that could come into existence due to the algorithm. Now, we can have the objective function represent the approximation factor of the algorithm for the problem.
The task of computing the approximation factor is reduced to finding an instance with the worst approximation factor in the new LP polytope. Such a technique has come to be known as a factor-revealing LP. Similar LPs were used on several occasions [17, 33, 36] before the technique acquired its current name. The best known ratio for the facility location problem is due to a combination of factor-revealing LPs and other techniques [29]. Of late, this idea has found relevance in several online bipartite matching problems including [37, 15, 16]. Similarly, this technique was used to obtain tighter bounds for the greedy remote-clique problem [8] and secretary problems [10, 11].

Chapter 3
Preliminaries

In this chapter, we take a look at the preliminaries helpful in understanding and analyzing the MAX SMTI problem with one-sided ties. In particular, we will be looking at two important relaxations for the SMTI problem based on (i) LP relaxation, and (ii) stable allocation relaxation.

3.1 Symmetric Difference and Augmenting Paths

Definition 2. Symmetric Difference. For two graphs G and G′ with common vertex set V, the symmetric difference G ∆ G′ is the graph containing all the edges that appear in exactly one of G or G′.

For a stable matching instance, let M_OPT be an optimal (i.e., maximum cardinality) stable matching and M be any maximal stable matching. If we take the symmetric difference M ∆ M_OPT, we get a graph with paths and even length cycles. The edges on any path or cycle alternate between edges from M_OPT and M.

Definition 3. Augmenting Path. We call an odd length alternating path with start and end edges belonging to M_OPT an augmenting path, since along this path, we could always exchange the edges belonging to matching M with edges belonging to matching M_OPT. The size of the matching M′ obtained after this transformation exceeds that of M by 1.
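Definitions 2 and 3 translate directly into a small routine. The sketch below is illustrative; the function name `augmenting_path_lengths` and the encoding of a matching as a set of (man, woman) pairs are assumptions. It forms the symmetric difference of two matchings, splits it into connected components, and reports the length of every component that has one more edge from M_OPT than from M, which is exactly the odd alternating path with both end edges in M_OPT.

```python
from collections import defaultdict

def augmenting_path_lengths(m_opt, m):
    """Lengths of the augmenting paths (with respect to m) in m_opt ^ m."""
    diff = m_opt ^ m                       # edges in exactly one matching
    adj = defaultdict(list)
    for a, b in diff:
        u, v = ("L", a), ("R", b)          # tag sides so names cannot clash
        adj[u].append((v, (a, b)))
        adj[v].append((u, (a, b)))

    seen, lengths = set(), []
    for start in adj:
        if start in seen:
            continue
        # Collect all edges of the component containing `start`.
        stack, comp_edges = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            for v, e in adj[u]:
                comp_edges.add(e)
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        n_opt = sum(e in m_opt for e in comp_edges)
        n_m = len(comp_edges) - n_opt
        if n_opt == n_m + 1:               # one extra M_OPT edge: augmenting
            lengths.append(len(comp_edges))
    return sorted(lengths)

# A hypothetical pair of matchings whose symmetric difference is a 3-path.
m_opt = {("m1", "w1"), ("m2", "w2")}
m = {("m1", "w2")}
print(augmenting_path_lengths(m_opt, m))   # [3]
```

Even cycles and even-length paths contribute equally many edges from both matchings, so the routine ignores them.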
10 Tie m1 m2 w2 m2 w2 w2 w1 m1 m1 w1 Maximum stable matching Maximal stable matching Figure 3.1: Maximum vs maximal stable matchings. Note that in the context of stable matching, the result M 0 may not be stable. Alternating paths of even length do not affect the size of our matching when toggled this way. As augmenting paths are of odd length, the number of edges in it can be expressed by 2k + 1 for k ≥ 1. For an augmenting (2k + 1)-path, we have k + 1 edges from MOP T and k edges from M . Lemma 4. Let graph G = MOP T ∆M . If G does not contain any augmenting (2k − 1)-paths or shorter, then |MOP T | ≤ k+1 k |M |. Proof. The symmetric difference does not contain any augmenting (2k − 1)-paths or shorter. As a result, the length of the augmenting paths in G is at least (2k + 1). The augmenting paths are disjoint. For an augmenting path, toggling edges in M to edges in MOP T increases the size of our matching by 1. This implies that the difference between |MOP T | and |M | is the number of disjoint augmenting paths p (i.e., |MOP T | − |M | = p). The number of MOP T edges is at least p(k + 1) (i.e., |MOP T | ≥ p(k +1)) as we know that the length of augmenting paths is at least (2k +1). Substituting for p in the above equation, we get |MOP T | − |M | = p ≤ |MOP T | ≤ 3.1.1 k+1 k |MOP T | k+1 . Solving this, we can establish that |M |. Augmenting 3-paths and their Elimination The graph M ∆MOP T does not contain any augmenting 1-paths, as this would block the matching M . The worst approximation ratio we can obtain is when M ∆MOP T consists solely of augmenting 3-paths (paths of length three). Figure 3.1 depicts an augmenting 3-path. The solid edges belong to MOP T while the dashed edge belongs to M . Here, for every edge from M , we have two edges from MOP T . Thus we obtain an approximation ratio of 2. 
Figure 3.2 is an example where we assume that there exist no augmenting 3-paths or augmenting 5-paths in the symmetric difference. Hence, we are left with augmenting paths of length 7 and longer. In an augmenting 7-path, 3 of the edges come from $M$, while the remaining 4 edges come from $M_{OPT}$. Thus, the worst approximation factor we could get here would be $\frac{4}{3}$.

Figure 3.2: Augmenting 7-paths in 2 different configurations (I and II). Ties are indicated with parentheses.

For the general case, McDermid [35] was able to find a solution $M$ for which $M \Delta M_{OPT}$ contained no augmenting 3-paths, giving a $\frac{3}{2}$-approximation. The runtime complexity of this algorithm was $O(|V|^{3/2} |E|)$. Paluch [38] and Király [31] were able to reduce the runtime complexity to $O(|E|)$ while achieving the same approximation ratio.

Király [30] introduced the idea of promotions. For the restricted case with one-sided ties, this is powerful enough to prevent any augmenting 3-paths from coming into existence. We start with proposals from single men as in the GS algorithm. Each time a man exhausts his preference list and ends up unmatched, we promote him to the next priority level and let him propose from the start of his list. Initially, each man starts at priority level zero. If a man reaches level $i$, this implies that he has exhausted his list $i$ times. We run this until every man is either matched or is single and has reached level 2. Within a tie, a woman accepts a proposal from a man who is at a strictly higher level than the others in the tie. Doing so eliminates any augmenting 3-paths in $M \Delta M_{OPT}$. For purposes of contradiction, suppose we end up with an augmenting 3-path as in Figure 3.1. Here $m_2$ is single and has reached level 2, implying that he must have proposed to $w_2$ after being promoted to level 1.
On the other hand, $m_1$ never exhausted his list, as he is matched to $w_2$ and $w_1$ on his list is single. He remains at level 0 till the end. Thus it is contradictory to assume that $w_2$ accepted a proposal from $m_1$ at level 0 instead of a proposal from $m_2$ at level 1.

Though useful, this technique fails to eliminate all augmenting 5-paths, as we shall see in the next chapter. Iwama et al. [28], Huang et al. [20] and Radnai [39] removed a constant fraction of augmenting 5-paths for the restricted case with one-sided ties. This resulted in approximation ratios of $\frac{25}{17}$, $\frac{22}{15}$ and $\frac{41}{28}$ respectively. Further, Iwama et al. [23] were able to eliminate augmenting paths of length 3, 5 and 7 for the one-sided case where ties are allowed to occur only at the end of a preference list. Thus, by Lemma 4 their algorithm obtains an approximation ratio of $\frac{5}{4}$ for this special case.

3.2 Linear Programming Relaxation

The following integer program (IP) is a generalization of [43] and [41] for the MAX SMTI problem with one-sided ties. For each man-woman pair $(m_i, w_j)$, we introduce a variable $x_{ij}$. The variable takes the value 1 if the pair $(m_i, w_j)$ is in the matching and 0 otherwise. The set $E$ consists of the edges $(m_i, w_j)$ given by the preference lists. The sets $L$ and $R$ consist of the participating men and women respectively. Any matching obtained by solving the IP is an optimal stable matching.

Objective
\[ \max \sum_{(m_i, w_j) \in E} x_{ij} \tag{3.1} \]
Constraints
\[ \sum_{w_j} x_{ij} \le 1 \quad \forall m_i \in L \tag{3.2} \]
\[ \sum_{m_i} x_{ij} \le 1 \quad \forall w_j \in R \tag{3.3} \]
\[ \sum_{w_{j'} \succ_{m_i} w_j} x_{ij'} + \sum_{m_{i'} \succeq_{w_j} m_i} x_{i'j} \ge 1 \quad \forall (m_i, w_j) \in E \tag{3.4} \]
\[ x_{ij} \in \{0, 1\} \quad \forall (m_i, w_j) \in E \tag{3.5} \]

Constraints 3.2 and 3.3 indicate that each person can be matched to at most one other person. We will refer to constraint 3.4 as the stability constraint. This constraint states that for every edge $(m_i, w_j) \in E$, at least one of $m_i$ or $w_j$ should be matched to a partner that is equally or more preferred than the partner along this edge, in line with the definition of weak stability.
This ensures that $m_i$ and $w_j$ are not both matched to partners they like less than each other, and thus helps retain stability. If the stability constraint is satisfied with equality, it is said to be tight.

We now state the linear programming (LP) relaxation of the aforementioned IP for the sake of completeness. In the relaxation, we remove the restriction that the variables $x_{ij}$ take integral values. Thus our LP solution might contain fractional $x_{ij}$ values.

Objective
\[ \max \sum_{(m_i, w_j) \in E} x_{ij} \tag{3.6} \]
Constraints
\[ \sum_{w_j} x_{ij} \le 1 \quad \forall m_i \in L \tag{3.7} \]
\[ \sum_{m_i} x_{ij} \le 1 \quad \forall w_j \in R \tag{3.8} \]
\[ \sum_{w_{j'} \succ_{m_i} w_j} x_{ij'} + \sum_{m_{i'} \succeq_{w_j} m_i} x_{i'j} \ge 1 \quad \forall (m_i, w_j) \in E \tag{3.9} \]
\[ x_{ij} \ge 0 \quad \forall (m_i, w_j) \in E \tag{3.10} \]

Let the optimal solution obtained by solving the LP be $OPT_{LP}$ and the optimal integral solution be $OPT_{IP}$. The value of $OPT_{LP}$ may or may not be integral, so each person may be matched completely or fractionally. We know that $OPT_{LP} \ge OPT_{IP}$, since the optimal IP solution is still feasible for the LP relaxation. A general technique for obtaining an approximation algorithm is to cleverly round the fractional LP solution to a feasible integral solution. If the objective does not change by much in doing so, we can achieve good approximation factors. Unfortunately, the integrality gap (see Definition 5) of the LP relaxation has been shown to be lower bounded by $1 + \frac{1}{e}$ [28]. Thus, this is the best approximation ratio we can hope to achieve by rounding the LP.

Definition 5. The integrality gap of an integer program is the worst-case ratio, over all instances of the problem, of the value of an optimal solution to the linear programming relaxation (LP) to the value of an optimal solution to the integer programming formulation [45].

3.2.1 Integrality Gap Example

For the sake of completeness, we present here the instance which gives the $1 + \frac{1}{e}$ integrality gap [28]. Consider the following instance $I_1$.
\[
\begin{array}{ll}
m_1 : w_1 \cdots w_k \; w_{k+1} & w_1 : (m_1 \cdots m_k) \; m_{k+1} \\
\quad\vdots & \quad\vdots \\
m_k : w_1 \cdots w_k \; w_{2k} & w_k : (m_1 \cdots m_k) \; m_{2k} \\
m_{k+1} : w_1 & w_{k+1} : m_1 \\
\quad\vdots & \quad\vdots \\
m_{2k} : w_k & w_{2k} : m_k
\end{array}
\]

The integral optimal solution $OPT_{IP} = \{(m_i, w_i) \mid i = 1, \ldots, k\}$ has size $k$. The optimal fractional solution $OPT_{LP}$, however, is attained by the following solution $x^*$.

• $x_{ij} = \frac{1}{k} \left(1 - \frac{1}{k}\right)^{j-1}$ for $(i, j) \in \{1, 2, \ldots, k\} \times \{1, 2, \ldots, k\}$
• $x_{i,k+i} = \left(1 - \frac{1}{k}\right)^{k}$ for $i \in \{1, 2, \ldots, k\}$
• $x_{k+i,i} = 1 - \left(1 - \frac{1}{k}\right)^{i-1}$ for $i \in \{1, 2, \ldots, k\}$

Summing these values, the first group contributes $k\left(1 - \left(1 - \frac{1}{k}\right)^{k}\right)$ and each of the other two groups contributes $k\left(1 - \frac{1}{k}\right)^{k}$, so the optimal LP value is $k + k\left(1 - \frac{1}{k}\right)^{k}$. The integrality gap is given by $OPT_{LP} / OPT_{IP}$, which is equal to $1 + \left(1 - \frac{1}{k}\right)^{k}$. This expression tends to $1 + \frac{1}{e}$ as $k$ tends to infinity.

3.2.2 Algorithms using LP Relaxation

Solving the SMTI LP (see Section 3.2) gives us a solution that is optimal in the fractional sense. The solution does not necessarily translate to an integral stable matching. Iwama et al. [28] were the first to use the $x^*_{ij}$ values obtained by solving the SMTI LP relaxation to break ties while running the GS algorithm to produce an integral stable matching. All the approaches prior to this promoted men only after they had exhausted their entire preference list and remained single. The promotion scheme here is more sensitive. On being rejected, a man is promoted immediately, though fractionally. As before, in case of a tie, a woman accepts a proposal from a man at a strictly higher priority level. Based on LP properties, they were able to show that a constant fraction of augmenting 5-paths are removed by GS-LP. This enabled them to break the $\frac{3}{2}$ barrier and reduce the ratio to $\frac{25}{17}$.

3.3 Stable Allocation Relaxation

Recent techniques used to improve the approximation ratio for the SMTI problem have characteristics similar to those of the stable allocation problem.
Here, we show how to interpret stable matchings in the framework of stable allocations, as this may be of use in developing new techniques to improve the approximation factor for the SMTI problem.

The stable allocation problem is generally studied in the context of jobs and machines. The jobs belong to the set $L$ and the machines belong to the set $R$. Each job $i$ has a processing time $p_i$, each machine $j$ has a capacity $c_j$, and at most $u_{ij} = \min(p_i, c_j)$ units of job $i$ can be assigned to machine $j$ along edge $(i, j)$. The jobs have a strict, transitive and complete preference list over the machines and vice versa. The goal is to find an allocation that is stable. We denote by $x_{ij}$ the real-valued allocation along the edge $(i, j)$. An edge $(i, j)$ is considered blocking if the following conditions hold: (i) $x_{ij} < u_{ij}$, (ii) $x_{ij'} > 0$ for some $j'$ with $j \succ_i j'$, and (iii) $x_{i'j} > 0$ for some $i'$ with $i \succ_j i'$. The conditions indicate that it is possible to increase the allocation along the more preferred edge $(i, j)$, and thus the given allocation cannot be stable. An allocation is a feasible stable allocation if the following conditions are satisfied:
\[ \sum_{j \in R} x_{ij} = p_i \quad \forall i \in L \tag{3.11} \]
\[ \sum_{i \in L} x_{ij} = c_j \quad \forall j \in R \tag{3.12} \]
\[ \Big( p_i - \sum_{j' \succeq_i j} x_{ij'} \Big) \Big( c_j - \sum_{i' \succeq_j i} x_{i'j} \Big) = 0 \quad \forall (i, j) \in E \tag{3.13} \]
\[ x_{ij} \ge 0 \quad \forall (i, j) \in E \tag{3.14} \]

We can restate the constraints in the context of stable matching by setting every $p_i$ and $c_j$ to 1, giving an alternative relaxation of the stable matching problem, which we call the SA relaxation. We now switch back to the $m_i$, $w_j$ terminology used in the context of stable matchings and state the stable matching constraints as stable allocation constraints. A matching is stable and feasible if the following conditions hold:
\[ \sum_{w_j \in R} x_{ij} \le 1 \quad \forall m_i \in L \tag{3.15} \]
\[ \sum_{m_i \in L} x_{ij} \le 1 \quad \forall w_j \in R \tag{3.16} \]
\[ \Big( 1 - \sum_{w_{j'} \succeq_{m_i} w_j} x_{ij'} \Big) \Big( 1 - \sum_{m_{i'} \succeq_{w_j} m_i} x_{i'j} \Big) = 0 \quad \forall (m_i, w_j) \in E \tag{3.17} \]
\[ x_{ij} \ge 0 \quad \forall (m_i, w_j) \in E \tag{3.18} \]

Constraints 3.15 and 3.16 are exactly the matching constraints.
Constraint 3.17 indicates that for an edge $(m_i, w_j)$, at least one of $m_i$ or $w_j$ is fully matched to partners that are equally preferred or better. Thus, constraint 3.17 ensures stability in the matching, as at least one of $m_i$ or $w_j$ is matched to someone at least as good as the other. The SA relaxation is indeed a relaxation of the SMTI problem: restricting the $x$ variables to integers models exactly the SMTI problem, just like the LP. Hence its optimum is an upper bound on $OPT_{IP}$.

The SA allocation exhibits the following interesting property. If the allocation up to $w_j$ on $m_i$'s list is 1, then the allocation up to $m_i$ on $w_j$'s list may or may not add up to 1. Given an SMTI instance $I$, we can saturate the nodes by adding dummy nodes to both sides of the bipartite graph. Let the sets of dummy nodes be $L_D$ and $R_D$. The dummy nodes appear last in the preference lists. Any allocation to a dummy node from a regular node implies that the regular node is unsaturated. Let the function $f(x) = \sum_{i \notin L_D,\, j \notin R_D} x_{ij}$.

Lemma 6. Let $x^*$ be a fractional solution to the SA relaxation. We can round it to a feasible integral solution $x'$ for SMTI such that $f(x^*) \le f(x')$.

Proof. Consider an instance $I$ of the SMTI problem. We modify the instance and add dummy nodes on both sides of the bipartite graph such that all the men and women can be saturated. Each dummy node appears at the end of the preference lists of members from the opposite set. Now, any unsaturated man or woman can be saturated by matching the remainder to the dummy partner. Thus, constraints 3.11 and 3.12 are satisfied with equality (due to the dummy nodes catering to the partially matched nodes).

Figure 3.3: Eliminating a cycle by spinning flow. At each node we increment along an incoming edge and decrement along an outgoing edge or vice versa. The change in allocation at each node is 0. (Legend: edge weight $\frac{1}{2}$.)

We start with a feasible solution $x^*$ to the SA relaxation. Now, consider the graph $G' = \{(m_i, w_j) \mid 0 < x^*_{ij} < 1\}$.
If $G'$ contains no edges, we have an integral solution. Otherwise, due to our construction, $G'$ contains a cycle, as all the nodes are fully allocated. We can eliminate a cycle by spinning flow in the clockwise or anti-clockwise direction. After choosing a direction to spin flow, we do the following at each node: increase the allocation along the incoming edge and decrease the allocation along the outgoing edge. Note that the edges chosen for this are fractional due to the definition of the graph $G'$. Figure 3.3 is an example of spinning flow across one such cycle. As we increment and decrement by the same amount, the total allocation at each node remains unchanged. The chosen direction should not decrease $f(x)$, as this would imply an increase in allocation to dummy nodes. We can eliminate cycles by shifting allocation in this manner, and we continue until all cycles have been eliminated. What we are left with at the end is an integral solution.

Let $x''$ be the new allocation after the elimination of a cycle. We have $f(x^*) \le f(x'')$, as we never choose a direction to spin flow that increases the allocation to the dummy nodes. At the same time, the increment and decrement at a node only occur along fractional edges. This conserves the allocation at each node.

Consider the initial stable allocation $x^*$ and an edge $(m_i, w_j) \in G'$. Assume that $\sum_{m_{i'} \succeq_{w_j} m_i} x^*_{i'j} = 1$ (shown in Figure 3.4). Thus we have $\sum_{m_i \succ_{w_j} m_{i'}} x^*_{i'j} = 0$ due to Constraint 3.16. For purposes of contradiction, suppose that after eliminating a cycle we have $\sum_{m_{i'} \succeq_{w_j} m_i} x''_{i'j} < 1$. We know that the increment-decrement step does not result in a loss of allocation at a node. This implies that the loss in region $r_1$ was compensated by a gain in region $r_2$ on $w_j$'s list (see Figure 3.4). This is not possible, as there are no edges in $G'$ from region $r_2$.

Figure 3.4: Feasible allocation $x^*$ for edge $(m_i, w_j)$ and $0 < x^*_{ij} < 1$. Region $r_1$ of $w_j$'s list contains the men at least as good as $m_i$ (total allocation 1), and region $r_2$ contains the men below $m_i$.
Thus, the increment-decrement step happens within region $r_1$ and the new allocation $x''$ is still feasible. As we never increase the allocation to dummy nodes in the process of spinning allocation, we have $f(x^*) \le f(x')$.

If $x^*$ is an optimal solution for the SA relaxation, then due to the fact that it is a relaxation, we have $f(x^*) \ge OPT_{IP}$. Since Lemma 6 allows us to round $x^*$ to a feasible integral solution with no decrease in $f(x^*)$, this implies the reverse inequality as well, so $f(x^*) = OPT_{IP}$; otherwise stated:

Corollary 7. The SA relaxation has no integrality gap, so approximating the SMTI problem is equivalent to approximating its SA equivalent.

3.3.1 New Approach by Huang et al.

Huang et al. [20] recently gave an algorithm that has a stable allocation interpretation. Unlike the GS algorithm, here each person is allowed to be matched with up to two persons from the other set. As a result, each man has 2 proposals to make and each woman can accept up to 2 proposals. Based on the algorithm, each edge $(m_i, w_j) \in E$ is given a value as follows: $x_{ij} = 1$ if $w_j$ accepts both of $m_i$'s proposals, $x_{ij} = \frac{1}{2}$ if $w_j$ accepts exactly one proposal from $m_i$, and $x_{ij} = 0$ if $w_j$ receives or accepts no proposals from $m_i$. Using the $x_{ij}$ values thus obtained, they create an auxiliary graph $G'$ such that $(m_i, w_j) \in G'$ if $x_{ij} > 0$. This is equivalent to computing a half-integral matching. The construction restricts the maximum degree of $G'$ to 2. They compute a maximum matching in $G'$ such that all nodes with degree two are matched. This can be done quickly, as $G'$ consists of disjoint paths and cycles.

They show that augmenting 3-paths never come into existence. Due to the construction, no augmenting 5-path can exist independently; each one depends on other augmenting structures of length 5 and beyond for its existence. Figure 3.5 is an example of a path of length 5 present independently in the auxiliary graph $G'$.
If rounded incorrectly (diagonal edges chosen over horizontal edges), this results in an augmenting 5-path.

Figure 3.5: Path of length 5 in the half-integral auxiliary graph $G'$. It becomes an augmenting path if the diagonal edges are chosen in the matching. (Legend: edge weight $\frac{1}{2}$.)

Given this fractional allocation, we can see that it satisfies constraints 3.15 to 3.17. Finding a maximum matching in which all degree 2 nodes are matched translates to rounding a fractional allocation to an integral one. By Lemma 6, we know that we can round a fractional stable allocation to an integral one without any loss. The maximum matching on $G'$ gives the edges $(m_1, w_1)$, $(m_2, w_2)$ and $(m_3, w_3)$. We started out with a fractional matching of weight $\frac{5}{2}$ and rounded it up to 3. Thus, independent paths of length 5 are eliminated.

Let $M$ be the matching obtained by running the algorithm of Huang et al. [20] and $M_{OPT}$ an optimal matching for the instance under consideration. Let $R$ be the set of all augmenting 5-paths in $M \Delta M_{OPT}$ and $Q = M \Delta M_{OPT} \setminus R$. The set $Q$ now consists of augmenting paths of length $2l + 3 \ge 7$, cycles with at least $2l$ edges, or alternating paths with $2l - 1$ edges with $l$ edges from $M$. As paths in $R$ are not independent, they show a mapping scheme in which each path $p \in R$ is mapped to some augmenting path in $Q$. In turn, using a novel charging scheme, they show that each matched node (from matching $M$) in $Q$ can be charged at most 1.5 units and that in an augmenting path $q \in Q$, at most $2 l_q$ nodes can be charged. Thus, each such path can be charged at most $c_q \le 3 l_q$ units. Thus, we have
\[ |M_{OPT}| = \sum_{q \in Q} \left( |M_{OPT} \cap q| + 3 c_q \right), \]
where we are counting 3 edges from each of the $c_q$ augmenting 5-paths mapped to $q$. Similarly,
\[ |M| = \sum_{q \in Q} \left( |M \cap q| + 2 c_q \right). \]
The approximation ratio is maximized by charging augmenting 7-paths for augmenting 5-paths. Thus we have $l_q = 2$ and $c_q \le 6$. Maximizing $\frac{|M_{OPT}|}{|M|}$ using the expressions from before, we get $\frac{4 + 3 \cdot 6}{3 + 2 \cdot 6}$, which is the claimed approximation ratio of $\frac{22}{15}$. Recently, Radnai [39] improved their analysis by showing that at most 5.5 units can be charged to an augmenting 7-path and reduced the bound further to $\frac{41}{28}$. Currently, this is the best known approximation bound.

Chapter 4
Counterexamples and Lessons Learned

This chapter presents negative results and counterexamples to help streamline future research efforts by avoiding avenues of attack that do not seem promising. As a reminder, we are trying to design algorithms for the MAX SMTI problem with one-sided ties.

4.1 Iterative and Randomized Rounding

The first approach considered is commonly known as iterative rounding. The procedure utilizes properties of extreme point solutions obtained from solving LPs. Any LP can be visualized as optimization over a convex polytope. Polyhedral theory states that, whenever the LP has an optimal solution, one is attained at an extreme point (corner) of the polytope. In its most common form, we start by solving the LP under consideration. Based on some inherent property of the extreme points of the polytope (e.g., we might be able to prove that some variable in an extreme point solution must be equal or close to 1), the value of a variable $x$ is set to 1. Next, we solve the LP with the additional constraint $x = 1$. We do this iteratively, setting one variable at a time to 1 until we end up with an integral solution. Singh [42] utilized properties of extreme point solutions in order to obtain iterative algorithms with good approximation factors for problems such as the bounded degree spanning tree problem [32].
For example, suppose one could show that for our problem, there exists an edge $(m_i, w_j) \in E$ for which $x_{ij} = 1$ in an extreme point solution. In such a case, we could match nodes $m_i$ and $w_j$ and iterate on the remaining instance, keeping the value of $x_{ij}$ fixed to 1 in the remaining iterations. Unfortunately, the following is an example where such an approach seems unlikely to succeed.

Figure 4.1: LP and IP solutions to the instance $I_2$: (a) $OPT_{LP} = 8$, (b) $OPT_{IP} = 7$. The legend distinguishes edges of weight $\frac{1}{2}$ from edges of weight 1. Preference lists, with ties in parentheses:
Men: 1: 2 7 1; 2: 2 7 1; 3: 4 8 3; 4: 4 8 3; 5: 5; 6: 6 7; 7: 7 8 5; 8: 6 8.
Women: 1: (1 = 2); 2: (1 = 2); 3: (3 = 4); 4: (3 = 4); 5: 7 5; 6: 8 6; 7: 6 (1 = 2) 7; 8: 8 (3 = 4) 7.

Consider the instance $I_2$ (Figure 4.1), in which $|L| = |R| = 8$. Observe that the women have ties in their preference lists. Here, $OPT_{LP} = 8$ while $OPT_{IP} = 7$ (Figure 4.1). In the unique LP optimal solution (Figure 4.1a), each man $m \in \{1, 2, 3, 4, 6, 7, 8\}$ is matched fractionally to two women, each with a weight of $\frac{1}{2}$. This is indicated by the dashed edges. Man 5 is matched to woman 5 completely ($x_{5,5} = 1$). On the other hand, in the unique integral optimal solution to $I_2$, $x_{5,5} = 0$. Thus an attempt to fix $x_{5,5} = 1$ and continue with an iterative procedure cannot reach the optimal integral solution.

A prominent method used in approximation algorithms is to round the fractional LP solution to an integral solution without losing much of its value. Randomized rounding is one such LP rounding method, where we treat the fractional value $x_{ij}$ obtained by solving the LP as the probability that edge $(m_i, w_j)$ is chosen. As the fractional values $x_{ij}$ lie between 0 and 1, it is natural to consider these values as probabilities. Each fractional value $x_{ij}$ is rounded up to 1 with a probability related to $x_{ij}$ and rounded down to 0 with a probability related to $1 - x_{ij}$.
By rounding the LP in this manner, we intend to obtain an integral solution without changing the expected objective value by much. Another rounding approach would be to deterministically round the fractional values based on a threshold. A fractional value is rounded up to 1 if it is above a threshold value and rounded down to 0 otherwise.

Two hurdles exist with these rounding techniques. First, it seems difficult to ensure that the stability constraint (Constraint 3.9) is satisfied for every edge. Secondly, consider the variable $x_{5,5}$. In instance $I_2$, the LP solution gives it a value of 1 while in the IP solution it has a value of 0. Similarly, the variable $x_{7,5}$ is 0 in the LP solution while it is 1 in the optimal integral solution. These observations indicate that we have variables with $x_{ij} = 0$ that must be rounded up to 1 and variables with $x_{ij} = 1$ that must be rounded down to 0.

4.1.1 α Rounding

Teo et al. [43] came up with an elegant rounding procedure for the standard stable matching problem (i.e., no ties and complete lists), where they choose an arbitrary value $\alpha \in (0, 1]$. Given the fractional values obtained by solving the LP relaxation, for each man $m_i$ we can arrange the allocations to each of the women in his preference list on a number line over the interval $(0, 1]$. Let man $m_i$'s preference list have $n$ women. We construct $n$ consecutive intervals of the type $(a, b]$, one per woman in preference order. The interval for $w_j$ has length $x_{ij}$, the allocation along edge $(m_i, w_j)$ in the LP solution. Each man is then matched to the woman whose interval contains $\alpha$, and vice versa. This is an interesting result, as they show that a stable perfect matching is obtained for any choice of $\alpha$. Their argument uses the fact that the standard stable matching polytope is integral. We cannot use the same technique here, as it may fail to generate a feasible matching. For example, consider the integrality gap instance $I_1$ presented earlier (see Section 3.2.1).
We take the $x_{ij}$ values obtained by solving the LP for the instance and arrange them on the number line as described. Observe that the preference lists of the first $k$ men begin with the same $k$ women. Also, their fractional allocation to each of the first $k$ women in the list is the same. If we were to use this form of "α-rounding" and choose a small enough $\alpha$, all of the $k$ men would end up getting matched to the same woman.

4.2 Deterministic "Primal" LP Rounding

Another rounding scheme would be to start with an optimal LP solution $x^*$ and round up certain variables while rounding down the remaining variables. While doing so, we must not violate any of the LP constraints (Inequalities 3.7 – 3.10). The primary goal is to end up at an integral solution while staying feasible. Based on $x^*$, we create an auxiliary graph $G' = \{(m_i, w_j) \mid 0 < x^*_{ij} < 1\}$. If there are no edges in $G'$, then $x^* \in \{0, 1\}^{|E|}$ and we are done. Otherwise, $G'$ could contain cycles and paths consisting of fractional edges. Consider one such cycle. In order to move towards an integral solution, we need to eliminate cycles. A common technique is to spin allocation along the cycle. This is achieved by incrementing and decrementing along incoming and outgoing edges. The increment-decrement step ensures that the matching constraints are satisfied. If we can also satisfy the stability constraints, then we can hope to round the LP solution to an integral one.

Consider the instance in Figure 4.2. The solution $x^*_{ij} = \frac{1}{2}$ is feasible for the instance. Not all of the stability constraints are tight. Based on $x^*$, the auxiliary graph $G'$ is a cycle in which each edge has a weight of $\frac{1}{2}$. Let the amount by which we change the allocation along an edge be $\epsilon$. We increment and decrement along incoming and outgoing edges by $\epsilon$. Let the new vector $x'$ indicate the modified allocation along the edges. As before, we need to stay feasible en route to an integral solution.
The matching constraints for all the nodes are satisfied, as the $+\epsilon$ along the incoming edge is offset by the $-\epsilon$ along the outgoing edge. With regard to the stability constraints, we have:

• Edge $(m_1, w_1)$: $x'_{1,1} + x'_{2,1} = (x^*_{1,1} + \epsilon) + (x^*_{2,1} - \epsilon)$
• Edge $(m_1, w_2)$: $x'_{1,1} + x'_{1,2} + x'_{2,2} = (x^*_{1,1} + \epsilon) + (x^*_{1,2} - \epsilon) + (x^*_{2,2} + \epsilon)$
• Edge $(m_2, w_1)$: $x'_{1,1} + x'_{2,1} = (x^*_{1,1} + \epsilon) + (x^*_{2,1} - \epsilon)$
• Edge $(m_2, w_2)$: $x'_{2,1} + x'_{2,2} + x'_{1,2} = (x^*_{2,1} - \epsilon) + (x^*_{2,2} + \epsilon) + (x^*_{1,2} - \epsilon)$

The left-hand side of the stability constraint for edge $(m_2, w_2)$ has two decremented terms and one incremented term, so it sees a net decrease of $\epsilon$. Spinning allocation in the opposite direction flips every sign, which causes edge $(m_1, w_2)$ to lose $\epsilon$ towards its stability constraint instead. Thus, spinning allocation long enough could lead to infeasibility.

Figure 4.2: Instance where allocation to men in ties is equal. The fractional edges form a cycle, each with edge weight $\frac{1}{2}$. Preference lists: $m_1$: $w_1\ w_2$; $m_2$: $w_1\ w_2$; $w_1$: $(m_1\ m_2)$; $w_2$: $(m_1\ m_2)$.

4.3 Stable Allocation Variants

The insight from Section 3.3 is useful, as it shows that the SA relaxation has no integrality gap. This provides another viable route towards an approximation algorithm: focus on obtaining a good fractional solution to the SA relaxation (since it can be rounded without any loss). Huang et al. [20] do this, coming up with an algorithm that generates a fractional (in fact 1/2-integral) matching whose value is provably at least some fraction of $OPT$. We note that their algorithm seems to balance load across ties, and wonder whether a similar approach would succeed if we go beyond half-integrality and instead generate a fully continuous solution to the SA relaxation via a GS variant that enforces this sort of balance across ties.

By balancing load across a tie in [20], a woman retains her partners instead of having to take a decision immediately by breaking the tie. Having two choices did help in improving the approximation ratio.
Following this line of thought, we consider the case where we balance load across a tie without limiting the number of accepted partners to just 2, while staying feasible by not violating the matching constraints. Consider woman $w_j$ and her tied region $T$ with $n \ge 2$ men. Assuming that all the tied men proposed with 1 unit of allocation at some point during the GS algorithm, woman $w_j$ accepts an equal share of the allocation from each of the proposals. For example, if $m_1$ proposes with 1 unit, he is matched completely. Later, when $m_2$ proposes with 1 unit, the amount of allocation accepted from each of $m_1$ and $m_2$ is adjusted to $\frac{1}{2}$. On receiving a proposal with 1 unit of allocation from the $i$th person in the tie, the allocation accepted from each of the $i$ persons is $\frac{1}{i}$. Assuming all the $n$ men in the tie $T$ proposed, we have $\{x^*_{ij} = \frac{1}{n} \mid 1 \le i \le n\}$. This is true even if each of the men proposed with an allocation of only at least $\frac{1}{n}$. This natural generalization helps a woman retain more suitors, as any fractional allocation along an edge, i.e., $(m_i, w_j)$ with $x^*_{ij} > 0$, results in an edge in the residual graph. While rounding the solution $x^*$, having more edges could be beneficial. A possible rounding scheme could be to obtain a matching in which all the fully matched nodes stay matched. In the instance given by Figure 4.2, this results in each man being accepted fractionally ($x_{ij} = \frac{1}{2}$) by women $w_1$ and $w_2$.

Figure 4.3: An instance where continuous rounding fails. Preference lists, with ties in parentheses:
Men: $a_i$: $a'_i$ for $i = 1, \ldots, n-1$; $b_i$: $b'_i, a'_1, \ldots, a'_{n-1}$ for $i = 1, \ldots, n$; $c_i$: $b'_1, b'_2, \ldots, b'_n, c'_i$ for $i = 1, \ldots, n-1$.
Women: $a'_i$: $(b_1, \ldots, b_n)\ a_i$ for $i = 1, \ldots, n-1$; $b'_i$: $(b_i, c_1, \ldots, c_{n-1})$ for $i = 1, \ldots, n$; $c'_i$: $c_i$ for $i = 1, \ldots, n-1$.

Consider the instance in Figure 4.3.
There are a total of $3n - 2$ nodes on each side, divided into three groups. The optimal solution has size $3n - 2$ and matches the edges of type $a - a'$, $b - b'$ and $c - c'$. Only edges of type $b - a'$ could block such a matching, but they do not here. Assume that the GS algorithm tries to match the $c$ nodes first, followed by the $b$ and $a$ nodes. Observe that the continuous version allocates $\frac{1}{n}$ to all the edges except for edges of type $a - a'$ and $c - c'$. The size of the matching we obtain due to this will be $2n - 1$. Thus the approximation ratio we obtain is $\frac{3n-2}{2n-1} \approx \frac{3}{2}$.

Another alternative rule would be for a woman to equalize, across the tied men who propose, the ratio between the amount of allocation rejected and the amount of allocation accepted. In this case, we see that very few $a - a'$ and $c - c'$ edges have non-zero allocation. All other $a - a'$ and $c - c'$ edges have zero allocation. Thus, modifying the acceptance scheme in case of a tie does not increase the size of the matching in our favor.

Chapter 5
Factor-Revealing LP and 19/13 Approximation

In this chapter, we describe the GS-LP algorithm due to Iwama et al. [28] and analyze it using a factor-revealing LP ($LP_f$). Its optimal solution is an upper bound on the approximation ratio obtained by GS-LP.

5.1 GS-LP Algorithm

We now describe the GS-LP algorithm (see Algorithm 1). Our interpretation of the algorithm diverges slightly from the original, as this simplifies our $LP_f$ formulation. As before, we start by solving the LP relaxation described in Section 3.2 to obtain an optimal solution $x^*$. Each man $m \in L$ is assigned an integer priority level $P_i(m) \in \{0, 1, 2, \ldots\}$ and a fractional priority level $P_f(m) \in [0, 1]$. Both start at zero. $P_i(m)$ indicates the number of times man $m$ has exhausted his preference list.

The standard GS algorithm is quite straightforward: in each iteration, an arbitrary unassigned man proposes to the next woman on his preference list.
Each woman tentatively holds on to the best proposal received thus far, rejecting all other offers. When a man is rejected, he continues issuing proposals down his list. Each edge is proposed along at most once, leading to an $O(|E|)$ running time for the standard GS algorithm. In our extended version of the GS algorithm, priorities are used to break ties: if $w$ is tentatively matched with $m$ but receives a proposal from a man $m'$ for which $m =_w m'$, she accepts if $P_i(m') > P_i(m)$, or if $P_i(m) = P_i(m')$ but $P_f(m') > P_f(m)$.

Let $w(m)$ denote the farthest woman down $m$'s preference list up to whom he has proposed; initially, $w(m) = \emptyset$. Let $j(m)$ be the current index on $m$'s preference list; initially, $j(m) = \emptyset$. Now, $w_{j(m)}$ represents the woman at index $j(m)$ on $m$'s preference list. The fractional priority for $m$ is defined as
\[ P_f(m) = \sum_{w \succeq_m w(m)} x^*_{mw}. \]
Every time $m$ is rejected by $w(m)$, he returns to the start of his preference list, given by $j(m) = 1$, and has another chance to propose to the women up to $w(m)$ in sequence (he may be more successful this time, since his fractional priority now includes the contribution due to $x^*_{m,w(m)}$). If $m$ is rejected by the last woman on his list, he proposes to all the women from the beginning of his list one final time with $P_f(m) = \sum_{w} x^*_{mw}$. On being rejected by all the women in this pass, he increments the value of his integer priority $P_i(m)$, resets $w(m)$ to $\emptyset$ (so $P_f(m)$ resets to zero as well), and restarts the proposal process from the beginning of his list. Men whose integer priorities reach a specified level $n$ cease to issue any further proposals, and the algorithm terminates when all unmatched men satisfy $P_i(m) = n$. The parameter $n = \frac{d+1}{2}$, where $d$ is the length of the longest augmenting path being considered in the $LP_f$. Iwama et al. use $n = 3$, as they consider up to augmenting 5-paths.
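The proposal process just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of [28]: the LP is not solved here, so the values $x^*$ are taken as an input `x_star`; the names `gs_lp`, `men_prefs`, `women_rank`, `frontier` and `nxt` are hypothetical; indices are 0-based; and free men are processed in an arbitrary (stack) order.

```python
def _prefers(rank, new, cur, p_int, p_frac):
    """True if a woman with ranking `rank` takes `new` over `cur`."""
    if rank[new] != rank[cur]:
        return rank[new] < rank[cur]          # strictly better rank wins
    # Tie: compare integer priority first, then fractional priority.
    return (p_int[new], p_frac[new]) > (p_int[cur], p_frac[cur])


def gs_lp(men_prefs, women_rank, x_star, n=3):
    """Sketch of GS with integer and fractional priorities (GS-LP).

    men_prefs[m]     : strict preference list of man m (best first)
    women_rank[w][m] : rank of m on w's list; equal ranks form a tie
    x_star[(m, w)]   : LP value of edge (m, w), assumed precomputed
    n                : integer priority at which a man stops proposing
    Returns a dict mapping each matched man to his partner.
    """
    p_int = {m: 0 for m in men_prefs}         # P_i(m)
    p_frac = {m: 0.0 for m in men_prefs}      # P_f(m)
    frontier = {m: 0 for m in men_prefs}      # women reached in this level
    nxt = {m: 0 for m in men_prefs}           # index of the next proposal
    partner_of_woman = {}
    free = list(men_prefs)

    while free:
        m = free.pop()
        prefs = men_prefs[m]
        while p_int[m] < n:
            if nxt[m] >= len(prefs):
                # Full pass made with P_f = sum of x*: promote and restart.
                p_int[m] += 1
                p_frac[m] = 0.0
                frontier[m] = 0
                nxt[m] = 0
                continue
            j = nxt[m]
            w = prefs[j]
            if j == frontier[m]:
                # First proposal to w in this level: gain x* and go back
                # to the start of the list for the following proposals.
                p_frac[m] += x_star.get((m, w), 0.0)
                frontier[m] = j + 1
                nxt[m] = 0
            else:
                nxt[m] = j + 1
            cur = partner_of_woman.get(w)
            if cur is None or _prefers(women_rank[w], m, cur, p_int, p_frac):
                partner_of_woman[w] = m
                if cur is not None:
                    free.append(cur)          # the displaced man is free again
                break
    return {m: w for w, m in partner_of_woman.items()}
```

On the small instance with $m_1$: $w_2\ w_1$, $m_2$: $w_2$, $w_1$: $m_1$ and $w_2$: $(m_1\ m_2)$, the sketch returns the matching $\{(m_1, w_1), (m_2, w_2)\}$ whether the tie at $w_2$ is decided by the fractional priorities or, when all $x^*$ values are taken to be zero, by the promotion of $m_2$.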
Algorithm 1 Modified GS-LP Algorithm
Input: An SMTI instance $I$
Output: A stable matching $M$
    For the given instance $I$, formulate the integer program (IP)
    Solve its corresponding LP relaxation and let the optimal solution obtained be $x^*$
    Let $M := \emptyset$
    $\forall m \in L$, let $P_i(m) := 0$, $P_f(m) := 0$, $w(m) := \emptyset$, $j(m) := \emptyset$
    while there exists a man $m$ such that $M(m) = \emptyset$ and $P_i(m) < n$ do
        Let $m$ be such a man
        if $m$ proposed to all the women on his list with $P_f(m) = \sum_{w} x^*_{mw}$ then
            $P_i(m) \leftarrow P_i(m) + 1$
            $P_f(m) \leftarrow 0$
            $w(m) \leftarrow \emptyset$
            $j(m) \leftarrow \emptyset$
        end if
        if $m$ has not yet proposed to $w_{j(m)}$ in the current level then
            $P_f(m) \leftarrow P_f(m) + x^*_{m w_{j(m)}}$
        end if
        Let man $m$ propose to woman $w_{j(m)}$
        if woman $w_{j(m)}$ accepts $m$'s proposal then
            if woman $w_{j(m)}$ is matched with some man $m'$ then
                $M \leftarrow (M \setminus \{(m', w_{j(m)})\}) \cup \{(m, w_{j(m)})\}$
            else
                $M \leftarrow M \cup \{(m, w_{j(m)})\}$
            end if
        end if
        if $m$ proposed to $w_{j(m)}$ for the first time in the current level then
            $w(m) \leftarrow j(m)$
            $j(m) \leftarrow 1$
        else
            $j(m) \leftarrow j(m) + 1$
        end if
    end while

5.1.1 Useful Observations

For an SMTI instance $I$, we define $M$ to be the set of edges that come into existence after running GS-LP, and $M_{OPT}$ to be the set of edges in an optimal matching for the same instance.

Matching $M$ is stable. For purposes of contradiction, suppose edge $(m, w) \in E \setminus M$ is blocking. If $w$ never received any proposals from $m$, then $M(m) \succ_m w$. If $m$ did propose to $w$ and was rejected, it must be the case that she was matched to some $m'$ such that $m' \succeq_w m$. As the algorithm progresses, $w$ can only be matched to someone at least as good as $m'$. Thus we have $M(w) \succeq_w m' \succeq_w m$. It is thus evident that edge $(m, w)$ cannot block $M$.

The approximation ratio for the GS-LP algorithm is maximized when $M \Delta M_{OPT}$ contains short augmenting paths.
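To illustrate why short augmenting paths are the worst case: an augmenting path with $i$ edges alternates $(i+1)/2$ edges of $M_{OPT}$ with $(i-1)/2$ edges of $M$, so on that component alone the ratio $|M_{OPT}|/|M|$ is $(i+1)/(i-1)$, which decreases as $i$ grows. The sketch below tabulates this ratio for a few path lengths. The function name `path_ratio` is an illustrative assumption, not notation from the text.

```python
from fractions import Fraction


def path_ratio(i: int) -> Fraction:
    """Ratio |M_OPT| / |M| restricted to one augmenting path with i edges.

    Assumes i is odd and at least 3, as for the augmenting paths in the text.
    """
    if i < 3 or i % 2 == 0:
        raise ValueError("an augmenting path here has an odd number of edges, at least 3")
    opt_edges = (i + 1) // 2   # edges of M_OPT on the path
    m_edges = (i - 1) // 2     # edges of M on the path
    return Fraction(opt_edges, m_edges)


# Shorter paths give larger ratios.
for i in (3, 5, 7, 9):
    print(i, path_ratio(i))
```

An augmenting 3-path would give ratio 2 and an augmenting 5-path gives 3/2, which is why eliminating the shortest paths is the first goal of the prioritization scheme.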
The ideas from Section 3.1.1 can be seen as the integer prioritization version of the GS-LP algorithm. The idea is applicable here and augmenting 3-paths are non-existent in $M \Delta M_{OPT}$ as a result. Proposition 8 proves the same in the context of the GS-LP algorithm.

[Figure 5.1: Augmenting 3-path on $m_2, w_2, m_1, w_1$, showing $M_{OPT}$ edges, $M$ edges and a tie.]

[Figure 5.2: Two examples (I and II) of augmenting 5-paths eliminated due to the GS-LP algorithm.]

Proposition 8. There are no augmenting 3-paths in $M \Delta M_{OPT}$.

Proof. Figure 5.1 is an example of an augmenting 3-path. Man $m_2$ and woman $w_1$ are unmatched (single) in $M$. Man $m_1$ never exhausted his preference list as woman $w_1$ on his preference list is single in $M$; hence, $P_i(m_1) = 0$. Since man $m_2$ remains single, $P_i(m_2) = 3$ at termination. At some point, $m_2$ would have proposed to $w_2$ with $P_i(m_2) = 2$, and therefore $w_2$ would have accepted $m_2$ over $m_1$.

Proposition 9. For $n \geq 3$, there are no augmenting 3-paths in $M \Delta M_{OPT}$, and all augmenting 5-paths appear in the configuration shown in Figure 5.3.

[Figure 5.3: The unique configuration of an augmenting 5-path on $m_3, w_3, m_2, w_2, m_1, w_1$, with each preference list partitioned into regions $r_1, r_2, \ldots$ and with $M_{OPT}$ edges and $M$ edges marked.]

Integer prioritization is not sufficient to prevent all augmenting paths of length 5 or longer, although it can prevent some types of paths from arising. Figure 5.2 gives two such paths eliminated by the GS-LP algorithm. For an augmenting 5-path not eliminated by GS-LP the following holds: we cannot have $m_3 =_{w_3} m_2$ or $m_3 \succ_{w_3} m_2$; in the first case, $m_3$ would eventually gain enough priority to win the tiebreak, and in the second case, $(m_3, w_3)$ would block $M$. Hence, $m_2 \succ_{w_3} m_3$, and therefore $w_2 \succ_{m_2} w_3$ or else $(m_2, w_3)$ would block $M_{OPT}$. In addition, $w_2 \succ_{m_1} w_1$, since otherwise $(m_1, w_1)$ would block $M$.
Finally, $m_1 =_{w_2} m_2$, since if $m_1 \succ_{w_2} m_2$ then $(m_1, w_2)$ would block $M_{OPT}$, and if $m_2 \succ_{w_2} m_1$ then $(m_2, w_2)$ would block $M$.

Further, the preference lists of the men and women are partitioned into regions; for example in Figure 5.3, for man $m_1$: region $r_1$ consists of all the edges (to women) $m_1$ prefers to $w_2$, region $r_2$ consists of the single edge $(m_1, w_2)$, region $r_3$ consists of edges beyond $(m_1, w_2)$ which have not been proposed along, and so on. Note that a lower region number implies higher preference. This translates directly to the edges it contains.

5.2 Longer Augmenting Paths and Validity

An augmenting $i$-path has $\frac{i+1}{2}$ men and women, numbered $m_1 \ldots m_{(i+1)/2}$ and $w_1 \ldots w_{(i+1)/2}$ along the path as in Figure 5.3. In an augmenting path, edges of the form $(m_{k-1}, w_k)$ appear in $M$ and edges $(m_k, w_k)$ appear in $M_{OPT}$. Woman $w_1$ and man $m_{(i+1)/2}$ are single in $M$. As opposed to augmenting 5-paths, augmenting 7-paths and beyond can appear in several different types of configurations; two configurations for augmenting 7-paths are shown in Figure 5.4. A configuration of an augmenting $i$-path is specified by the following:

• For each man $m_k$ with $k < \frac{i+1}{2}$, is $M(m_k) \succ_{m_k} M_{OPT}(m_k)$ or $M(m_k) \prec_{m_k} M_{OPT}(m_k)$?
• For each woman $w_k$ with $k > 1$, is $M(w_k) \succ_{w_k} M_{OPT}(w_k)$, $M(w_k) =_{w_k} M_{OPT}(w_k)$, or $M(w_k) \prec_{w_k} M_{OPT}(w_k)$?
• For each woman $w_k$ with $k > 1$, does the region of $w_k$'s preference list containing $M(w_k)$ have any edges to singles or not?

[Figure 5.4: Augmenting 7-paths due to two different configurations (I and II).]

Based on where $M$ and $M_{OPT}$ occur in the preference list of a man and woman, we can distinguish the preference lists into different types as shown in Figure 5.5. The factor-revealing LP considers all possible structures occurring in $M \Delta M_{OPT}$ collectively.
Thus, we need to generate valid configurations for each of the augmenting paths being considered. As augmenting paths may exist in different configurations, the following section describes a method to generate valid instances of augmenting paths up to the desired length.

5.2.1 Priorities

Definition 10. (High and low priority for men). Man $m_j$ in a path in $M \Delta M_{OPT}$ has high priority if $P_i(m_j) \geq j$ at termination; otherwise $m_j$ has low priority.

Definition 11. (High and low priority for women). Any woman $w_j$ matched in a path in $M \Delta M_{OPT}$ has high priority if and only if $M(w_j) = m_{j-1}$ has high priority; otherwise she has low priority.

The priority bit is set if the corresponding men and women have high priority. The following facts are easy to establish:

1. $P_i(m_j) = 0$ (and hence $m_j$ has low priority) if he is matched in $M_{OPT}$ to a woman ($w_j$) unmatched in $M$. Otherwise, he would have proposed to $w_j$, and she would have accepted. Since matched women stay matched, $w_j$ could not have ended up single.

2. Man $m_j$ has $P_i(m_j) = 0$ (and hence has low priority) if $m_j \succ_{w_j} m_{j-1}$, for similar reasons. Otherwise, $m_j$ would have proposed to $w_j$ and she would have accepted, preventing the edge $(m_{j-1}, w_j)$ from being part of $M$.

3. If $m_j$ has high priority and $m_j =_{w_j} m_{j-1}$, then $m_{j-1}$ has high priority, since $P_i(m_j) \geq j$ implies that $m_j$ must have proposed to $w_j$ at least at priority level $j - 1$. Hence, $P_i(m_{j-1}) \geq j - 1$, or else there is no way $w_j$ could have accepted $m_{j-1}$ in the matching $M$.

5.2.2 Valid Augmenting Paths

Except for single men and women in $M$, every person on an augmenting path has an optimal (O) partner and a GS-LP (G) partner due to the algorithm. As before, we partition the preference lists of men and women into regions. Once partitioned, the preference lists of men and women fall into one of the types shown in Figure 5.5. The men and women in $M$ may have a high priority, indicated by (p), at the end of GS-LP. Note that men do not have any ties.
Based on this, we can observe that the preference list of a man can belong to one of the following types from Figure 5.5: 1, 2, 5, 7, 8, 9, 10, 13 and 14. For example, his preference list cannot belong to type 4 or 6 since it does not contain a tie. Similarly, for women, the preference lists belong to one of the following types: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13 and 14. Preference lists of types 11 and 12 indicate that the corresponding men and women belong to the set of persons who are single in both $M$ and $M_{OPT}$.

Figure 5.5: Different types of preference lists. G and O indicate the GS-LP and optimal edges. T indicates a tie, and (p) indicates that the particular person has high priority at the end of GS-LP. Regions are listed from most preferred (left) to least preferred (right).

    Type   Preference list
    1      r1 | r2 G | r3 | r4 O | r5
    2      r1 | r2 O | r3 | r4 G | r5
    3      r1 | r2 O | r3
    4      r1 | [r2 T] | r3
    5      r1 | r2 O | r3                (p)
    6      r1 | [r2 T] | r3              (p)
    7      r1 | r2 G | r3 | r4 O | r5    (p)
    8      r1 | r2 O | r3 | r4 G | r5    (p)
    9      r1 | r2 G | r3
    10     r1 | r2 G | r3                (p)
    11     r1
    12     r1                            (p)
    13     r1 | r2 G, O | r3
    14     r1 | r2 G, O | r3             (p)

In the construction of any augmenting $i$-path there are three invariants. Woman $w_1$ is single in $M$ and her preference list is given by Figure 5.5, type 3. The preference list of $m_1$ can be one of two choices: type 1 or type 2 (see Figure 5.5). It cannot be of type 2, as $m_1$ and $w_1$ would then prefer each other, thereby blocking $M$. Man $m_{(i+1)/2}$ is single in $M$, has a high priority, and his preference list is given by type 5 (see Figure 5.5). This in turn imposes a restriction on the preference list of his optimal partner $w_{(i+1)/2}$. Her preference list cannot be of types 2, 4, 8, or 13. If her preference list is of type 2, then this blocks $M$; if type 4, $m_{(i+1)/2}$ gains high enough priority to break the tie; if type 8, then it blocks $M$; and her preference list cannot belong to type 13, as her partners in $M$ and $M_{OPT}$ would then be the same, which contradicts the fact that $m_{(i+1)/2}$ is single in $M$. Her preference list could be of type 1, 6 or 7.

With these conditions we can start building our augmenting $i$-path. Consider the construction of an augmenting path as a game where each person chooses his or her response one at a time. The chosen response should not block $M$ or $M_{OPT}$, as we are interested in building a valid augmenting path. Simultaneously, this should not contradict the tie-breaking behavior of GS-LP. For the edge $(m_{k-1}, w_k) \in M$ we cannot have preference lists where both $m_{k-1}$ and $w_k$ strictly prefer their $M$ partners over their $M_{OPT}$ partners (e.g. Figure 5.5, type 1). Similarly, for the edge $(m_k, w_k) \in M_{OPT}$ we cannot have preference lists where both $m_k$ and $w_k$ prefer each other to their $M$ partners (e.g. Figure 5.5, type 2). We build the path bottom up, starting with $w_1$ and ending with $m_{(i+1)/2}$. A woman $w_k$ chooses her response based on the response chosen by her $M$ partner, man $m_{k-1}$. Similarly, man $m_k$ chooses his response based on the response of his $M_{OPT}$ partner, woman $w_k$. We start building the augmenting path from woman $w_2$ all the way up to woman $w_{(i+1)/2}$, alternating between men and women.

Building paths using this procedure will generate augmenting paths, some of which are invalid. Figure 5.6 is one such example of an invalid augmenting 5-path obtained by following the construction procedure. The preference lists of women $w_2$ and $w_3$ are of type 6 as seen in Figure 5.5. Woman $w_2$ has high priority, which implies man $m_1$ also has high priority. This cannot be true, as it implies $P_i(m_1) \geq 1$ by Definition 10, which in turn implies that he sent a proposal to $w_1$. Such a proposal would have been accepted by $w_1$.

[Figure 5.6: Invalid augmenting 5-path on $m_3, w_3, m_2, w_2, m_1, w_1$, with $M_{OPT}$ edges and $M$ edges marked; T denotes a tie and (p) high priority.]

When considering augmenting paths up to length $d$, we generate all possible configurations for each augmenting $i$-path, $i = 5, 7, \ldots, d$.
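The two blocking rules used during the construction can be sketched as simple predicates. The encoding below is a hypothetical simplification and is not taken from the thesis: each endpoint of an edge is summarized only by whether that person strictly prefers the $M$ partner, strictly prefers the $M_{OPT}$ partner, or is indifferent. A person who is single in $M$ is treated as preferring the $M_{OPT}$ partner. The tie-breaking and priority checks are separate and are not covered here.

```python
# Hypothetical encoding of how a person ranks the M partner against the M_OPT partner.
PREFERS_M = "G"      # M partner strictly preferred (G before O, as in Figure 5.5, type 1)
PREFERS_OPT = "O"    # M_OPT partner strictly preferred (O before G, as in type 2),
                     # also used for a person who is single in M
TIED = "T"           # indifferent between the two partners (women only)


def m_edge_allowed(man: str, woman: str) -> bool:
    """Check an edge (m_{k-1}, w_k) of M.

    If both endpoints strictly prefer their M partners, the edge would block M_OPT.
    """
    return not (man == PREFERS_M and woman == PREFERS_M)


def opt_edge_allowed(man: str, woman: str) -> bool:
    """Check an edge (m_k, w_k) of M_OPT.

    If both endpoints strictly prefer each other to their M partners,
    the edge would block M.
    """
    return not (man == PREFERS_OPT and woman == PREFERS_OPT)


# Enumerate the response pairs that survive both blocking rules.
men_choices = (PREFERS_M, PREFERS_OPT)            # men have no ties
women_choices = (PREFERS_M, PREFERS_OPT, TIED)
surviving_m_edges = [(m, w) for m in men_choices for w in women_choices
                     if m_edge_allowed(m, w)]
surviving_opt_edges = [(m, w) for m in men_choices for w in women_choices
                       if opt_edge_allowed(m, w)]
```

For instance, `opt_edge_allowed(PREFERS_OPT, PREFERS_OPT)` is false, which mirrors the argument that $m_1$ cannot have a type 2 list when $w_1$ is single in $M$.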
We use what we call priority propagation in order to check the validity of each augmenting path thus generated. We follow a top-down approach starting with man $m_{(i+1)/2}$. For an augmenting $i$-path, man $m_{(i+1)/2}$ (single in $M$) is given a high priority. Using the facts from above, we can determine the priority levels of the men and women based on the augmenting path structure. If at any point we come across a case where we contradict one of the above facts or have $P_i(m_1) > 0$, we can conclude that the augmenting path is invalid.

5.3 Towards a Factor-Revealing LP

5.3.1 R-Edges and Invalid Edges

Using ideas from Section 5.2.2, we can generate augmenting paths of different lengths. The graph $M \Delta M_{OPT}$ contains augmenting structures of different lengths. When considering up to augmenting $d$-paths, all the $M$ edges that form a part of some augmenting $i$-path, where $i \leq d$, are accounted for. Let the set $M_i$ indicate all the edges in $M$ occurring in augmenting $i$-paths from all the possible configurations. The remaining $M$ edges not accounted for in any of the augmenting paths are represented by the set $R$, i.e., $R = M \setminus \bigcup_{i=5}^{d} M_i$. We can generate the different types of $R$ edges as follows: start with a valid preference list type for a man (e.g. he cannot have a tie) from Figure 5.5. For the chosen type, we need to ensure that the corresponding preference list chosen for the woman does not block $M$ or contradict the tie-breaking behavior of GS-LP. Following this, each edge $(m_1, w_1) \in R$ can be categorized into one of 16 types shown in Table 5.1. Note that our instance could have men and women who are single in both $M$ and $M_{OPT}$. Every edge in $M \Delta M_{OPT}$ is either in an augmenting path or is part of an $R$ edge. The endpoints of an edge in $E$ can now be represented by regions based on the preference lists of individuals.
Let $Z_L$ and $Z_R$ represent the sets of all regions from men and women across all preference list types from all possible configurations, i.e., augmenting paths and $R$ edges. An endpoint of an edge is characterized by three parameters: (a) augmenting structure (could be configuration type in case of an augmenting path, an $R$ edge or single), (b) person (e.g. $m_1$, $m_2$ or $m_3$ in case of an augmenting 5-path, $m_1$ in case of an $R$ edge, etc.), and (c) the region the edge belongs to (e.g. region $r_1$). An edge $e = (m, w) \in E$ can now be represented by its left endpoint according to its type $l(e) \in Z_L$ and its right endpoint according to its type $r(e) \in Z_R$. Note that not every pair of regions $(l, r) \in Z_L \times Z_R$ is valid in its ability to describe edges in $E$. The following $(l, r)$ pairings are invalid, since there cannot be any edges $e \in E$ satisfying $l = l(e)$ and $r = r(e)$:

• Region pairs $(l, r)$ corresponding to edges that would block $M$,
• Region pairs $(l, r)$ corresponding to edges that would block $M_{OPT}$,
• Region pairs $(l, r)$ corresponding to edges from a high-priority man $m$ to a woman $w$ such that $m \succ_w M(w)$ (this is not possible since $P_i(m) > 0$, so $m$ would have proposed to $w$, and she would have accepted, preventing later acceptance of $M(w)$).

Let $B \subset Z_L \times Z_R$ denote the set of all valid region pairs that remain after filtering the invalid pairs enumerated above. Note that every edge appears in exactly one such pair $(l(e), r(e)) \in B$.

Table 5.1: Different types of $R$ edges. These edges come into existence due to GS-LP. Columns 2 and 3 represent the corresponding preference lists of man $m_1$ and woman $w_1$. Regions are listed from most preferred (left) to least preferred (right).

    Type   Man m1                            Woman w1
    1      r1 | r2 G | r3 | r4 O | r5 (p)    r1 | [r2 T] | r3 (p)
    2      r1 | r2 G | r3 | r4 O | r5 (p)    r1 | r2 O | r3 | r4 G | r5 (p)
    3      r1 | r2 O | r3 | r4 G | r5 (p)    r1 | [r2 T] | r3 (p)
    4      r1 | r2 O | r3 | r4 G | r5 (p)    r1 | r2 G | r3 | r4 O | r5 (p)
    5      r1 | r2 O | r3 | r4 G | r5 (p)    r1 | r2 O | r3 | r4 G | r5 (p)
    6      r1 | r2 O | r3 | r4 G | r5 (p)    r1 | r2 G | r3 (p)
    7      r1 | r2 G | r3 | r4 O | r5        r1 | [r2 T] | r3
    8      r1 | r2 G | r3 | r4 O | r5        r1 | r2 O | r3 | r4 G | r5
    9      r1 | r2 O | r3 | r4 G | r5        r1 | [r2 T] | r3
    10     r1 | r2 O | r3 | r4 G | r5        r1 | r2 G | r3 | r4 O | r5
    11     r1 | r2 O | r3 | r4 G | r5        r1 | r2 O | r3 | r4 G | r5
    12     r1 | r2 O | r3 | r4 G | r5        r1 | r2 G | r3
    13     r1 | r2 G | r3                    r1 | [r2 T] | r3
    14     r1 | r2 G | r3                    r1 | r2 O | r3 | r4 G | r5
    15     r1 | r2 G | r3 (p)                r1 | [r2 T] | r3 (p)
    16     r1 | r2 G | r3 (p)                r1 | r2 O | r3 | r4 G | r5 (p)

5.3.2 Notation

The factor-revealing LP ($LP_f$) considers all possible configurations that collectively occur in $M \Delta M_{OPT}$. When considering matching $M$ due to GS-LP, we are particularly interested in a matching of minimum possible cardinality for the instance under consideration. Note that it is crucial that the structures collectively cover all the nodes in any instance exactly once. We start by partitioning $M \Delta M_{OPT}$ into "configurations"; for example, an augmenting $i$-path can come into existence from various configurations. Figure 5.4 is an example of an augmenting 7-path due to two different configurations. We use techniques from Section 5.2.2 in order to obtain valid configurations of augmenting $i$-paths. Let $C$ denote the set of configurations present in $M \Delta M_{OPT}$, and $\hat{C}$ represent the "truncated" set of configurations we obtain when we aggregate larger structures together into the set $R$. Suppose we consider augmenting $i$-paths for $i = 5, 7, 9, \ldots, d$. The $M$ edges not related to any of the augmenting paths under consideration are truncated to be a part of the set $R$ (defined in Section 5.3.1). The set $\hat{C}$ thus contains all the configurations from the augmenting paths and the $R$ edges. We now define the necessary variables useful in setting up the factor-revealing LP.

• Let $n_c$ denote the number of instances of configuration $c$.
• Let $g_c$ denote the number of $M$ edges (due to GS-LP) belonging to a single instance of configuration $c$.
• Let $f_c$ denote the approximation factor associated with the configuration $c$ (e.g. $f_c = \frac{3}{2}$ if $c$ represents an augmenting 5-path).
• Let $\alpha_c = \frac{n_c}{|M|}$.
• Define $\beta_{lr} = \frac{1}{|M|} \sum_{e \in E:\, l(e) = l,\, r(e) = r} x^*_e$, where $x^*$ is an optimal solution to the original LP. That is, $\beta_{lr}$ denotes the aggregate weight assigned by $LP(I)$ to the bundle of edges from region type $l \in Z_L$ to region type $r \in Z_R$.

The decision variables for the factor-revealing LP are the $\alpha$'s and $\beta$'s. Note that they are normalized by the cardinality of the matching $M$ given by the GS-LP algorithm. Also, any edge considered from now on is a valid edge, i.e., $(l(e), r(e)) \in B$.

5.3.3 The Factor-Revealing LP

The factor-revealing LP is now defined as:

$$LP_f = \max \left\{ \min \left( \sum_{(l,r) \in B} \beta_{lr},\ \sum_{c \in \hat{C}} f_c g_c \alpha_c \right) : (\alpha, \beta) \in P \right\},$$

where $P$ denotes the feasible region, described with the following 5 types of constraints.

• Matching constraints: For any man or woman in configuration $c \in \hat{C}$, the sum of $\beta_{lr}$ over all incident edge types $(l, r) \in B$ is at most $\alpha_c$.
• Stability constraints: For any greedy or optimal edge known to exist in configuration $c \in \hat{C}$, the sum of $\beta_{lr}$ over all relevant edges $(l, r) \in B$ is at least $\alpha_c$.
• Tie-breaking constraints: For any woman from a configuration $c \in \hat{C}$ with a tied region, the sum of $\beta_{lr}$ for all incoming greedy edges is at least the sum of $\beta_{lr}$ for all incoming optimal edges.
• Simplex constraint: $\sum_{c \in \hat{C}} g_c \alpha_c = 1$.
• Non-negativity constraint: $\alpha, \beta \geq 0$.

The factor-revealing LP uses the truncated set of configurations $\hat{C}$ instead of the full set of configurations $C$ in $M \Delta M_{OPT}$. We now formally state and prove the validity of these different constraints. For simplicity, we start with the original SMTI LP constraints and transform them to ones given by $\alpha$'s and $\beta$'s. The $LP_f$ considers each configuration collectively.
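As a sanity check on the simplex constraint and on the second term of the objective, the following sketch plugs in the $\alpha$ values that Table 5.2 reports for $d = 7$ ($\alpha_5 = 5/13$, $\alpha_7 = 1/13$, $\alpha_R = 0$). It assumes, following the path structure of Section 5.2, that an augmenting $i$-path contributes $g_c = (i-1)/2$ edges of $M$ with factor $f_c = (i+1)/(i-1)$, and it treats $\alpha_5$ and $\alpha_7$ as totals over all configurations of each length. The variable names are illustrative, and the first term of the objective (the sum of the $\beta_{lr}$) is not evaluated because the table does not list the $\beta$ values.

```python
from fractions import Fraction

# alpha values reported in Table 5.2 for d = 7; R edges carry no weight there.
alpha = {5: Fraction(5, 13), 7: Fraction(1, 13)}

# Assumed per-path quantities: an augmenting i-path has (i - 1)/2 edges of M
# and (i + 1)/2 edges of M_OPT, so f = (i + 1)/(i - 1).
g = {i: Fraction(i - 1, 2) for i in alpha}
f = {i: Fraction(i + 1, i - 1) for i in alpha}

# Simplex constraint: sum of g_c * alpha_c over the configurations.
simplex = sum(g[i] * alpha[i] for i in alpha)

# Second term of the objective: sum of f_c * g_c * alpha_c.
second_term = sum(f[i] * g[i] * alpha[i] for i in alpha)

print(simplex, second_term, float(second_term))
```

Under these assumptions the simplex constraint holds with equality and the second term evaluates to 19/13, in agreement with the value of $A(7)$ listed in Table 5.2.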
Thus, it suffices to write the constraints for each configuration $c$ instead of all its corresponding instances in $M \Delta M_{OPT}$. Let $Z(m)$ denote the regions in $m$'s preference list.

Lemma 12.

$$\sum_{(l,r) \in B:\, l \in Z(m)} \beta_{lr} \leq \alpha_c, \quad \forall m \in c,\ \forall c \in \hat{C}$$

$$\sum_{(l,r) \in B:\, r \in Z(w)} \beta_{lr} \leq \alpha_c, \quad \forall w \in c,\ \forall c \in \hat{C}$$

Proof. This constraint is the equivalent of the matching constraints (3.7 and 3.8). Consider some man $m \in c$, $c \in \hat{C}$. There are $n_c$ instances where he is present (e.g. man $m_1$ occurring in all instances of the unique augmenting 5-path configuration). The matching constraint (3.7) states that $\sum_{e = (m, w') \in E} x^*_e \leq 1$. Let $L(m)$ denote the set of $n_c$ individual men in $L$ represented by the generic man $m$. Since each individual man $m' \in L(m)$ satisfies the constraint in $LP(I)$, summing over all the $n_c$ instances, we have

$$\sum_{e = (m', w) \in E:\, m' \in L(m)} x^*_e \leq n_c.$$

Therefore,

$$\frac{1}{|M|} \sum_{e = (m', w) \in E:\, m' \in L(m)} x^*_e = \sum_{(l,r) \in B:\, l \in Z(m)} \beta_{lr} \leq \frac{n_c}{|M|} = \alpha_c.$$

The proof for any woman $w$ in $c$ is similar.

Let the notation $l' \succ l$ indicate that $l'$ is a region more preferred than $l$, and $r' \succeq r$ that $r'$ is a region at least as preferred as $r$. The following lemma is the aggregated equivalent of the stability constraint (3.9). For a woman $w$ belonging to configuration $c$, let $R(w)$ denote the set of individual women in $R$ represented by the generic woman $w$.

[Figure 5.7: Edge belonging to $M \Delta M_{OPT}$. | | indicates a tied region.]

Lemma 13.

$$\sum_{(l',r') \in B:\, l' \succ l} \beta_{l'r'} + \sum_{(l',r') \in B:\, r' \succeq r} \beta_{l'r'} \geq \alpha_c, \quad \forall (l, r) \in c,\ c \in \hat{C}$$

Proof. Consider an edge $e = (m, w) \in M \Delta M_{OPT}$. It is part of a configuration $c$ in $M \Delta M_{OPT}$. By the stability constraint, we have

$$\sum_{w' \succ_m w} x^*_{mw'} + \sum_{m' \succeq_w m} x^*_{m'w} \geq 1$$

(see Figure 5.7). Let $l(e) = l$ and $r(e) = r$. Since the edge is stable in all of the $n_c$ instances, aggregating over the instances we have

$$\sum_{e = (m, w') \in E:\, w' \succ_m w,\ m \in L(m)} x^*_e + \sum_{e = (m', w) \in E:\, m' \succeq_w m,\ w \in R(w)} x^*_e \geq n_c.$$

Therefore,

$$\frac{1}{|M|} \left( \sum_{e = (m, w') \in E:\, w' \succ_m w,\ m \in L(m)} x^*_e + \sum_{e = (m', w) \in E:\, m' \succeq_w m,\ w \in R(w)} x^*_e \right) = \sum_{(l',r') \in B:\, l' \succ l} \beta_{l'r'} + \sum_{(l',r') \in B:\, r' \succeq r} \beta_{l'r'} \geq \frac{n_c}{|M|} = \alpha_c.$$

Lemma 14.
In an augmenting path consider the following: woman $w_k$ has a tie in her preference list (type 4), $M(w_k) = m_{k-1}$ has a preference list of type 1, $M_{OPT}(w_k) = m_k$ has a preference list of type 2, and $M_{OPT}(m_{k-1}) = w_{k-1}$ has a preference list of type 2 or 3, shown below (entries listed in decreasing order of preference).

    I.    m_k: O, G    w_k: T    m_{k-1}: G, O    w_{k-1}: O
    II.   m_k: O, G    w_k: T    m_{k-1}: G, O    w_{k-1}: O, G

Let $l_{k-1}$ represent the region containing $M_{OPT}(m_{k-1})$, and $l_k$ represent the region containing $M(m_k)$. Then the following holds:

$$\sum_{(l,r) \in B:\, l \succ l_k} \beta_{lr} \leq \sum_{(l,r) \in B:\, l \succ l_{k-1},\ r \notin Y} \beta_{lr}$$

Proof. Note that $P_i(m_{k-1}) = 0$, since $m_{k-1}$ never proposed to $w_{k-1}$. Therefore, the fact that $w_k$ chose $m_{k-1}$ over $m_k$ implies that $P_i(m_k) = 0$ and that $P'_f(m_k) \leq P'_f(m_{k-1})$, where $P'_f$ denotes $P_f$ at the last point in time at which the algorithm broke the tie between $m_k$ and $m_{k-1}$. Since $m_k$ is matched to $w_{k+1}$ in $M$, his fractional priority at that point in time was equal to the weight of all the entries in his preference list prior to $w_{k+1}$:

$$P'_f(m_k) = \sum_{e = (m_k, w) \in E:\, w \succ_{m_k} w_{k+1}} x^*_e.$$

Since $m_{k-1}$ never proposed to woman $w_{k-1}$ and also never proposed to any woman $w$ matched in $M$ to a man $m$ for which $m_{k-1} \succ_w m$, we have

$$P'_f(m_{k-1}) \leq \sum_{e = (m_{k-1}, w) \in E:\, w \succ_{m_{k-1}} w_{k-1},\ r(e) \notin Y} x^*_e,$$

where $Y \subset Z_R$ denotes the set of all regions in $Z_R$ that are more preferred than the partner in $M$ within their same list. Combining these, we have

$$P'_f(m_k) = \sum_{e = (m_k, w) \in E:\, w \succ_{m_k} w_{k+1}} x^*_e \leq P'_f(m_{k-1}) \leq \sum_{e = (m_{k-1}, w) \in E:\, w \succ_{m_{k-1}} w_{k-1},\ r(e) \notin Y} x^*_e,$$

which, aggregated across all instances of the configuration, gives

$$\sum_{(l,r) \in B:\, l \succ l_k} \beta_{lr} \leq \sum_{(l,r) \in B:\, l \succ l_{k-1},\ r \notin Y} \beta_{lr}.$$

Lemma 15.

$$\sum_{c \in \hat{C}} g_c \alpha_c = 1$$

Proof. For a configuration $c$, let $M_c$ be the set of edges in $M(I)$ across all the instances of configuration $c$ in $M \Delta M_{OPT}$. By definition, we have $|M_c| = g_c n_c$. Substituting $\sum_{c \in \hat{C}} |M_c|$ for $|M|$, we have $|M| = \sum_{c \in \hat{C}} |M_c| = \sum_{c \in \hat{C}} g_c n_c$. Normalizing by $|M|$ on both sides, we obtain the desired equation.

5.3.4 Analysis

Lemma 16.
For any instance $I$, let $x^*$ be the optimal LP solution of $LP(I)$, let $M$ be a minimum cardinality matching we obtain by running GS-LP on $I$, and let $\alpha, \beta$ be defined as above using $M$ and $x^*$. Then $(\alpha, \beta) \in P$.

Proof. Non-negativity of $\alpha$ and $\beta$ is clear. The validity of the corresponding $LP_f$ constraints is shown by Lemmas 12, 13, 14, and 15.

Theorem 17. Let $A(I)$ denote the approximation ratio of GS-LP when run on instance $I$. Then $LP_f \geq A(I)$.

Proof. Let $x^*$ be the optimal solution to $LP(I)$, let $M$ be the minimum cardinality matching we obtain due to GS-LP on instance $I$, and let $\alpha, \beta$ be defined as above using $M$ and $x^*$. Lemma 16 shows that $(\alpha, \beta) \in P$. We now have

$$\sum_{(l,r) \in B} \beta_{lr} = \frac{1}{|M|} \sum_{e \in E} x^*_e = \frac{1}{|M|} LP(I) \geq \frac{1}{|M|} |M_{OPT}(I)| = A(I),$$

and

$$\sum_{c \in \hat{C}} f_c g_c \alpha_c \geq \sum_{c \in C} f_c g_c \alpha_c = \frac{1}{|M|} \sum_{c \in C} f_c g_c n_c = \frac{1}{|M|} |M_{OPT}(I)| = A(I).$$

Hence,

$$\min \left( \sum_{(l,r) \in B} \beta_{lr},\ \sum_{c \in \hat{C}} f_c g_c \alpha_c \right) \geq A(I),$$

so we have demonstrated a feasible solution of $LP_f$ with objective value at least $A(I)$. The maximum value of $LP_f$ is therefore an upper bound for $A(I)$.

5.3.5 Approximation Ratios

Iwama et al. [28] obtained an approximation ratio of $\frac{25}{17}$ by considering up to augmenting 5-paths. We extended their approach by considering larger augmenting structures. As the $LP_f$ takes into consideration truncated configurations, we let $\hat{C}_d$ be the set of truncated configurations when considering up to augmenting paths of length $d$. Note that the $R$ edges are represented by the truncated sets. Also, $R \subset \hat{C}_d \subseteq \hat{C}_{d'}$ where $d \leq d'$. Let $LP_f(d)$ represent the factor-revealing LP for the truncated set $\hat{C}_d$, and let $A(d)$ be the approximation ratio obtained by solving $LP_f(d)$. Table 5.2 gives the approximation ratios for $LP_f$ with different $d$'s up to length 9.

Table 5.2: Approximation Ratios for $LP_f(d)$

    d          5          7          9
    alpha_5    7/17       5/13       5/13
    alpha_7               1/13       1/13
    alpha_9                          0
    alpha_R    1/17       0          0
    A(d)       1.47058    1.461538   1.461538

The following lemma shows that $LP_f$ exhibits a monotone property.

Lemma 18. $LP_f(d) \geq LP_f(d + 2)$.

Proof.
This follows from the fact that $LP_f(d)$ is a relaxation of $LP_f(d + 2)$. That is, the optimal solution $(\alpha, \beta)$ for $LP_f(d + 2)$ has a natural mapping to a feasible solution $(\alpha', \beta')$ for $LP_f(d)$. We can break down the augmenting structures in $\hat{C}_{d+2} \setminus \hat{C}_d$ into the constituent $R$ edges. Thus, we have $LP_f(d) \geq LP_f(d)(\alpha', \beta') \geq LP_f(d + 2)$.

Lemma 19. If the optimal solution of $LP_f(d)$ assigns $\alpha_c = 0$ for all $c \in R$, then $LP_f(d) = LP_f(d')$ for all $d' \geq d$.

Proof. By the monotonicity property due to Lemma 18, we know that $LP_f(d) \geq LP_f(d')$. With $\alpha_c = 0$ for all $c \in R$, we can naturally extend the optimal solution $(\alpha, \beta)$ for $LP_f(d)$ to an equivalent solution $(\alpha, \beta)$ of $LP_f(d')$. Thus we have $LP_f(d) = LP_f(d)(\alpha, \beta) = LP_f(d')(\alpha, \beta) \leq LP_f(d')$, since only configurations $c \in R$ change their approximation factor $f_c$ when moving from $d$ to $d'$.

Lemmas 18 and 19 together with Table 5.2 are sufficient to prove the following theorem.

Theorem 20. MGS-LP attains an approximation ratio of at most $\frac{19}{13} \approx 1.461538$.

Chapter 6
Conclusions and Discussion

In this thesis, we studied the SMTI problem with ties restricted to one side. The problem was shown to be NP-hard. We have improved upon the current best known approximation ratio to obtain $\frac{19}{13}$. We have used what is commonly known as a factor-revealing LP to analyze the algorithm. To our knowledge, this is the first time the idea has been applied to stable matchings. The following questions arise directly from our approach.

• Is there an instance of the SMTI problem on which we obtain the approximation ratio $\frac{19}{13}$? Such an instance would show that we cannot hope to do any better than $\frac{19}{13}$ using GS-LP.
• Are there any further constraints based on the structure of augmenting paths that can be used? This would constrain the LP further and could be helpful in obtaining a better upper bound.
• Is there a method by which we can round the SMTI LP in order to obtain an approximation ratio closer to $1 + \frac{1}{e}$, the integrality gap example shown by Iwama et al. [28]?
• Is there a method for achieving stronger bounds based on the SA relaxation?

There is still room for improvement for the SMTI problem with one-sided ties, as the current best known result has an approximation ratio of $\frac{19}{13}$ and the best known theoretical bound is $\frac{4}{3}$.

Bibliography

[1] Canadian resident matching service.
[2] National residency matching program.
[3] Scottish foundation allocation scheme.
[4] Atila Abdulkadiroglu, Parag Pathak, Alvin Roth, and Tayfun Sönmez. The Boston public school match. American Economic Review Papers and Proceedings, May 2005.
[5] Atila Abdulkadiroglu, Parag A. Pathak, Alvin E. Roth, and Tayfun Sönmez. Changing the Boston School Choice Mechanism: Strategy-proofness as Equal Access. 2006.
[6] Mourad Baïou and Michel Balinski. Many-to-many matching: Stable polyandrous polygamy (or polygamous polyandry). Discrete Appl. Math., 101(1-3):1–12, April 2000.
[7] Mourad Baïou and Michel Balinski. The stable allocation (or ordinal transportation) problem. Math. Oper. Res., 27(3):485–503, August 2002.
[8] Benjamin E. Birnbaum and Kenneth J. Goldman. An improved analysis for a greedy remote-clique algorithm using factor-revealing LPs. Algorithmica, 55(1):42–59, 2009.
[9] P. Biro. Student admissions in Hungary as Gale and Shapley envisaged. Technical report, University of Glasgow, Department of Computing Science, 2008.
[10] Niv Buchbinder, Kamal Jain, and Mohit Singh. Secretary problems and incentives via linear programming. SIGecom Exch., 8(2):6:1–6:5, December 2009.
[11] Niv Buchbinder, Kamal Jain, and Mohit Singh. Secretary problems via linear programming. In IPCO, pages 163–176, 2010.
[12] Brian C. Dean and Siddharth Munshi. Faster algorithms for stable allocation problems. Algorithmica, 58(1):59–81, 2010.
[13] D. Gale and M. de Oliveira Sotomayor. Some remarks on the stable matching problem. Informes de matemática. Série B. Matemática aplicada. Inst. de matemática pura e aplicada, Conselho nacional de desenvolvimento científico e tecnológico, 1984.
[14] D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical Monthly, 69(1):9–15, 1962.
[15] Gagan Goel and Aranyak Mehta. Online budgeted matching in random input models with applications to adwords. In SODA, pages 982–991, 2008.
[16] Gagan Goel and Pushkar Tripathi. Matching with our eyes closed. In FOCS, pages 718–727, 2012.
[17] Michel X. Goemans and Jon M. Kleinberg. An improved approximation ratio for the minimum latency problem. Math. Program., 82:111–124, 1998.
[18] Dan Gusfield and Robert W. Irving. The Stable Marriage Problem: Structure and Algorithms. MIT Press, Cambridge, MA, USA, 1989.
[19] Magnús M. Halldórsson, Kazuo Iwama, Shuichi Miyazaki, and Hiroki Yanagisawa. Improved approximation results for the stable marriage problem. ACM Transactions on Algorithms, 3(3), 2007.
[20] Chien-Chung Huang and Telikepalli Kavitha. An improved approximation algorithm for the stable marriage problem with one-sided ties. In IPCO, pages 297–308, 2014.
[21] Robert W. Irving. Stable marriage and indifference. In Selected papers of the conference on Combinatorial Optimization, CO89, pages 261–272, 1994.
[22] Robert W. Irving, David Manlove, and Gregg O'Malley. Stable marriage with ties and bounded length preference lists. J. Discrete Algorithms, 7(2):213–219, 2009.
[23] Kazuo Iwama. Personal communication. 2014.
[24] Kazuo Iwama, David Manlove, Shuichi Miyazaki, and Yasufumi Morita. Stable marriage with incomplete lists and ties. In Proceedings of ICALP 99: the 26th International Colloquium on Automata, Languages and Programming, pages 443–452. Springer-Verlag, 1999.
[25] Kazuo Iwama, Shuichi Miyazaki, and Kazuya Okamoto. A (2 − c log n/n)-approximation algorithm for the stable marriage problem. IEICE Transactions, 89-D(8):2380–2387, 2006.
[26] Kazuo Iwama, Shuichi Miyazaki, and Naoya Yamauchi. A 1.875-approximation algorithm for the stable marriage problem. In SODA, pages 288–297, 2007.
[27] Kazuo Iwama, Shuichi Miyazaki, and Naoya Yamauchi. A (2 − c(1/sqrt(n)))-approximation algorithm for the stable marriage problem. Algorithmica, 51(3):342–356, 2008.
[28] Kazuo Iwama, Shuichi Miyazaki, and Hiroki Yanagisawa. A 25/17-approximation algorithm for the stable marriage problem with one-sided ties. In ESA (2), pages 135–146, 2010.
[29] Kamal Jain, Mohammad Mahdian, Evangelos Markakis, Amin Saberi, and Vijay V. Vazirani. Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP. J. ACM, 50(6):795–824, November 2003.
[30] Zoltán Király. Better and simpler approximation algorithms for the stable marriage problem. In ESA, pages 623–634, 2008.
[31] Zoltán Király. Linear time local approximation algorithm for maximum stable marriage. Algorithms, 6(3):471–484, 2013.
[32] Lap Chi Lau and Mohit Singh. Additive approximation for bounded degree survivable network design. In STOC, pages 759–768, 2008.
[33] Mohammad Mahdian, Evangelos Markakis, Amin Saberi, and Vijay V. Vazirani. A greedy facility location algorithm analyzed using dual fitting. In RANDOM-APPROX, pages 127–137, 2001.
[34] David F. Manlove, Robert W. Irving, Kazuo Iwama, Shuichi Miyazaki, and Yasufumi Morita. Hard variants of stable marriage. Theor. Comput. Sci., 276(1-2):261–279, April 2002.
[35] Eric McDermid. A 3/2-approximation algorithm for general stable marriage. In ICALP (1), pages 689–700, 2009.
[36] R. McEliece, E. Rodemich, H. Rumsey, and L. Welch. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Trans. Inf. Theor., 23(2):157–166, September 2006.
[37] Aranyak Mehta, Amin Saberi, Umesh Vazirani, and Vijay Vazirani. Adwords and generalized online matching. J. ACM, 54(5), October 2007.
[38] Katarzyna E. Paluch. Faster and simpler approximation of stable matchings. In WAOA, pages 176–187, 2011.
[39] András Radnai. Approximation algorithms for the stable matching problem. Master's thesis, Eötvös Loránd University, 2014.
[40] A.E. Roth and E. Peranson. The redesign of the matching market for American physicians: Some engineering aspects of economic design. volume 89, pages 748–780, 1999.
[41] Alvin E. Roth, Uriel G. Rothblum, and John H. Vande Vate. Stable matchings, optimal assignments, and linear programming. Math. Oper. Res., 18(4):803–828, November 1993.
[42] Mohit Singh. Iterative Methods in Combinatorial Optimization. PhD thesis, Carnegie Mellon University, 2008.
[43] Chung-Piaw Teo and Jay Sethuraman. The geometry of fractional stable matchings and its applications. Math. Oper. Res., 23(4):874–891, November 1998.
[44] Chung-Piaw Teo, Jay Sethuraman, and Wee-Peng Tan. Gale-Shapley stable marriage problem revisited: Strategic issues and applications. In Proceedings of the 7th International IPCO Conference on Integer Programming and Combinatorial Optimization, pages 429–438, London, UK, 1999. Springer-Verlag.
[45] David P. Williamson and David B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, New York, NY, USA, 1st edition, 2011.
[46] H. Yanagisawa. Approximation algorithms for stable marriage problems. PhD thesis, Kyoto University, 2007.