Algorithms

Dynamic Programming

LCS (Longest Common Subsequence)

Example strings: A = abcddeb, B = bdabaed.

opt(i, j) = 1 + opt(i-1, j-1)                   if A_i = B_j
opt(i, j) = max{ opt(i, j-1), opt(i-1, j) }     if A_i ≠ B_j

(Partially filled DP table for the two strings: row i, column j holds opt(i, j), with an extra row and column of zeroes for the empty prefixes.)

Ramsey Theory

Erdős and Szekeres proved that in every permutation of {1, …, n} there is an increasing or a decreasing subsequence of length at least √n. For example, observe the following permutation of {1, …, 10}:
10 5 8 1 4 3 6 2 7 9
(1,1) (1,2) (2,2) (1,3) …
The pairs record, for each element, the length of the longest increasing subsequence and the longest decreasing subsequence ending at that point (respectively). Every number raises the length of one of the two subsequences by one (either the increasing or the decreasing), so each pair of lengths is unique. Since there are n distinct pairs, one of the two coordinates must reach at least √n.

Bandwidth Problem

The bandwidth problem is an NP-hard problem.
Definition: Given a symmetric matrix, is there a way to permute the rows and columns (by the same permutation) so that the "bandwidth" of the matrix is smaller than a number received as input? Another variant is to find the permutation that produces the smallest bandwidth.
Viewing the matrix as the adjacency matrix of a graph whose vertices are numbered by the permutation, the bandwidth is the largest difference |i − j| between the numbers of the endpoints of an edge.
(Example symmetric 0/1 matrix omitted.)

Special Cases

If the graph is a line, finding the smallest bandwidth is very easy (it is 1 — align the vertices along a line). For "caterpillar" graphs, however, the problem is already NP-hard. ("Caterpillars" are graphs that consist of a line with occasional side lines.)

Parameterized Complexity

Given a parameter k, does the graph have an arrangement with bandwidth ≤ k?
One idea is to remember the last k vertices (including their order!), and additionally which vertices appear before and after them (without order), so that we know not to place a vertex twice. A trivial solution would be to simply remember them all.
But while it is feasible to remember the last k vertices, remembering the set of vertices we have already passed takes C(n, s) possibilities, where s is the step index. For s ≈ n/2 this is exponential in n.
A breakthrough came in the early 80s by Saxe. The main idea is to remember the vertices we've passed implicitly. In order to do so, here are a few observations:
1) Wlog, G is connected (otherwise we can find the connected components in linear time, run the algorithm independently on each of them, and concatenate the results at the end).
2) The maximal degree of the graph (if it has bandwidth ≤ k) is ≤ 2k.
The arrangement is split into: Prefix | Active Region (the last k vertices) | Suffix.
Due to these observations, a vertex is in the prefix if it is connected to the active region without using dangling edges. A vertex in the suffix must use dangling edges to connect to the active region.

-------- end of lesson 1

Color Coding

The measure of complexity would be the expected running time (over the random coin tosses of the algorithm) for the worst possible input.

Monte Carlo Algorithms
There is no guarantee the solution is correct. It is probably correct.

Las Vegas Algorithms
The answer is always correct, but the running time might differ between runs.

Hamiltonian path
A simple path that visits all vertices of a graph.

Longest Simple Path
TODO: Draw vertices
We are looking for the longest simple path from some vertex v to some vertex u. A simple path means we are not allowed to visit a vertex twice. This problem is NP-hard.
Given a parameter k, find a simple path of length k. A trivial algorithm would work in time n^k.

Algorithm 1
- Take a random permutation of the vertices.
- Remove every "backward" edge.
- Find the longest path in the remaining graph using dynamic programming.
TODO: Draw linear numbering
For some permutation, some of the edges go forward and some go backward. After removing the backward edges, we get a DAG. For each vertex v from left to right, record the length of the longest path that ends at v.
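The last step of Algorithm 1 can be sketched as follows (a minimal sketch; the vertex numbering and edge-list format are illustrative assumptions):

```python
def longest_path_in_dag(n, edges):
    """Longest path (counted in edges) in a DAG whose vertices are
    numbered 0..n-1 so that every edge (u, v) has u < v -- exactly the
    situation after removing the backward edges of a permutation."""
    best = [0] * n  # best[v] = length of the longest path ending at v
    for u, v in sorted(edges):  # left-to-right: best[u] is final here
        best[v] = max(best[v], best[u] + 1)
    return max(best)
```

Sorting the edges by their left endpoint guarantees that when edge (u, v) is processed, all edges ending at u have already been accounted for, so each vertex is handled in one left-to-right sweep.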
Suppose the graph contains a path of length k. What is the probability that all edges of the path survived the first step? The k vertices of the path have k! possible relative orders, and only one of them is good for us. So with probability 1/k! we do not drop any edge of the path, and the algorithm succeeds.
This is not too good! Already for k = 10 the probability is tiny. So what do we do? Keep running it until it succeeds.
The probability of a single failure is (1 − 1/k!), but if we run it n·k! times the failure probability becomes:
(1 − 1/k!)^(n·k!) ≈ (1/e)^n
The expected number of runs is O(k!), so the expected running time is O(n^2 · k!).
You can approximate k! ≈ e^(k log k), so k! ≤ n as long as k ≈ log n / log log n.

Algorithm 2
- Color the vertices of the graph at random with k colors.
- Use dynamic programming to find a colorful path (a path on which all colors are distinct).
The DP table has a row for every vertex v_i and a column for every set S of colors; an entry records whether some path ending at v_i uses exactly the colors of S (S is the set of colors used so far). There are n · 2^k entries.
Probability that all vertices on a path of length k got different colors:
k!/k^k ≈ e^(−k)
If k ≈ log n, then the success probability is about 1/n (since e^(−log n) = 1/n), so polynomially many repetitions suffice.

Tournaments

A tournament is a directed graph in which every pair of vertices is connected by an edge (in exactly one direction). An edge exists if the player represented by the vertex of origin beat the player represented by the second vertex.
We try to find the most consistent ranking of the players: a linear order that minimizes the number of upsets. An upset is an edge that goes backwards with respect to the chosen order. A Feedback Arc Set in a Tournament (FAST) is the set of all arcs that go backwards.

Another motivation
Suppose you want to open a meta search engine. How can you aggregate the answers given by the different engines?
TODO: Draw the ranking…
The problem is NP-hard.

k-FAST
Find a linear arrangement in which at most k edges go backwards, where 0 ≤ k ≤ C(n, 2).
We will describe an algorithm that is good for n ≤ k ≤ O(n^2).
Suppose there are t subparts for which we have the optimal solution and we want to merge them into a bigger solution.
If the graph G is partitioned into t parts, and we are given the "true" order within each part, merging the parts takes time n^O(t) (the O is just for the bookkeeping).

Color Coding
Pick t (a function of k) and color the vertices independently at random with t colors. We want that:
For the minimum feedback arc set F, every arc of F has endpoints of distinct colors. Denote this "F is colorful".
Why is this desirable? If F is colorful, then for every color class we do know the true order. We use the fact that this is a tournament! Within a color class no arc of F is present, and between every two vertices there is an arc (in some direction), so the order is unique.
Let's look at two extreme cases:
1) t > 2k — by a union bound, the probability that some arc of F is monochromatic is at most k/t < 1/2. However, the runtime would be n^O(k).
2) t < √k — not too good! Intuition: you have √k vertices where the direction of the arcs is essentially random, so you can create an instance with very bad chances (this is not a real proof).
If t > 8√k, then with good probability F is colorful. The probability behaves something like 1/e^t, so the expected number of repetitions is e^t, and combined with the previous running time the total is still n^O(t):
(1 − 1/t)^k ≈ (1/e)^(k/t) ≈ (1/e)^√k    for t ≈ √k
so the total running time is n^O(√k).

Lemma: Every graph with k edges has an ordering of its vertices in which no vertex has more than √(2k) forward edges.
Why is this true? Pick an arbitrary graph with k edges, and order the vertices from left to right by repeatedly choosing the vertex of lowest degree in the remaining graph (the suffix).
deg_i = degree of v_i
f_i = number of forward edges of v_i
The forward edges of v_i are exactly its edges into the suffix, and by the choice of v_i every vertex of the suffix has remaining degree ≥ f_i. So the f_i forward neighbors of v_i contribute at least f_i · f_i to the degree sum:
f_i · f_i ≤ Σ_{j>i} deg_j ≤ Σ_j deg_j = 2k
Therefore f_i^2 ≤ 2k → f_i ≤ √(2k).

The chance of failure for some vertex is bounded by f_i/t (each forward neighbor has a chance of having the wrong color). Therefore, the chance of success is at least 1 − Σ_i f_i/t.
However, since f_i is bounded by √(2k), for the estimate below to be valid, t must be larger than √(2k) (in fact we take t = √(8k) = 2√(2k), so that f_i/t ≤ 1/2). Using 1 − z ≥ e^(−2z) for z ≤ 1/2:
∏_i (1 − f_i/t) ≥ ∏_i e^(−2f_i/t) = e^(−2 Σ_i f_i / t) = e^(−2k/t) ≈ e^(−√k)    (using Σ_i f_i = k)

------- end of lesson 2

Repeating last week's lesson: in every graph with k edges, if we color its vertices at random with √(8k) colors, then w.p. ≥ (2e)^(−√(8k)) the coloring is proper.
Assume we color the graph with t colors.
TODO: Draw the example that shows the coloring events are dependent.

Inductive Coloring
Given an arbitrary graph G, find the vertex with the least degree in the graph, and remove it. Then find the next one with the least degree, and so on…
This determines an ordering of the vertices v_1, v_2, …, v_n, where v_i has the least degree in the subgraph induced by {v_i, …, v_n}.
Then we start by coloring the last vertex. Each time we color a vertex according to the vertices to its right (so the coloring stays proper). If d is the maximum right degree of any vertex, then inductive coloring uses at most d + 1 colors.
In every planar graph, there is some vertex of degree at most 5. Corollary: planar graphs can be colored (easily) with 6 colors.
Every graph with k edges has an ordering of the vertices in which all right degrees are at most √(2k), so √(2k) + 1 colors suffice. But we don't want our chances of success to be too low, so we use twice that number of colors: t = 2√(2k) = √(8k).
Let the list of right degrees be f_1, f_2, …, f_n, with
Σ_{i=1}^n f_i = k
What is the probability of never violating the proper coloring? The fraction of colors left for vertex i is (t − f_i)/t, so the success probability is
∏_i (t − f_i)/t = ∏_i (1 − f_i/t)
But we know f_i ≤ √(2k) and t = √(8k), so f_i/t ≤ 1/2.
It's easier to evaluate sums than products. Suppose f_i/t = 1/x (a small number). Why is the following true?
1 − 1/x ≥ 2^(−2/x)
Let's raise both sides to the power x/2:
(1 − 1/x)^(x/2) ≥ (2^(−2/x))^(x/2) = 1/2
As long as 1/x ≤ 1/2, each successive power only chops off less than half of what remains, meaning the left side won't fall below 1/2. So the inequality is true.
So, back to the original formula:
∏_i (1 − f_i/t) ≥ ∏_i 2^(−2f_i/t) = 2^(−2 Σ_i f_i / t) = 2^(−2k/t)
and with t = √(8k) this is 2^(−√(k/2)).

Maximum weight independent set in a tree

Given a tree in which each vertex has a non-negative weight, we need to select a maximum weight independent set. In general graphs this problem is NP-hard, but in trees we can do it in polynomial time.
TODO: Add a drawing of a tree
We pick an arbitrary vertex r as the root, and think of the edges as being directed "away" from the root. Given a vertex v, denote by T(v) the subtree of vertices reachable from v (in the direction away from the root).
For each vertex we keep two variables:
I+(v) — the maximum weight of an independent set in the subtree T(v) that contains v.
I−(v) — the maximum weight of an independent set in the subtree T(v) that does not contain v.
We need to find I+(r) and I−(r); the answer is the larger of the two.
The initialization is trivial: for every leaf l of T, set I+(l) = w(l), I−(l) = 0, and remove it from the tree.
Pick a leaf v of the remaining tree, with children u_1, …, u_d (in the original tree):
I+(v) = w(v) + Σ_{i=1}^d I−(u_i)
I−(v) = Σ_{i=1}^d max{ I−(u_i), I+(u_i) }
This algorithm can also work on graphs that are slightly different from trees (do private calculations for the non-tree-like parts). Can we have a theorem for when the graph is just a bit different from a tree and the algorithm can still run in polynomial time?

Tree Decomposition of a Graph

We have some graph G, and we want to represent it as a tree T. Every node of the tree T is labeled by a set of vertices of the original graph G. Denote such sets as bags. We have the following constraints:
1) The union of all the bags is the entire vertex set of G.
2) Every edge ⟨v_i, v_j⟩ of G is contained in some bag.
3) For every vertex v ∈ G, the bags containing v form a connected subtree of T.
Meaning, we can walk between any two bags containing v through bags that contain v, without passing through bags that do not contain v. Given two bags B_1 and B_2, they are connected through a single path (because T is a tree), and every bag on this path must contain all vertices of B_1 ∩ B_2.

Tree Width of a Tree Decomposition

The tree width of T is k if the maximal bag size is k + 1. The tree width of G is the smallest k for which there is a tree decomposition of tree width k. Intuitively, a graph is closer to a tree when its k is smaller.

Properties regarding tree width of graphs

Lemma: If G has tree width k, then G has a vertex of degree at most k.
Observe a tree decomposition of G (wlog no bag contains another bag). It has some leaf bag B, with at most k + 1 vertices, and only one neighboring bag (since it's a leaf). Since no bag contains another, some vertex of B does not appear in the neighboring bag.
TODO: Copy the rest

Fact: tw(G\v) ≤ tw(G), since we can always take the original tree decomposition and remove v from every bag.
Corollary: If G has tree width k, then G can be properly colored with k + 1 colors.
A tree has tree width 1 (consistent with the fact that a tree is bipartite and can therefore be colored with 2 colors).
A graph with no edges has tree width 0, since you can make each bag a singleton vertex.
A complete graph on n vertices has tw = n − 1 (one bag holding all the vertices).
A complete graph missing one edge ⟨u, v⟩ has tw = n − 2: we can construct two bags, G − u and G − v, and connect them.

Theorem: G has tw = 1 iff G is a tree.
Assume G has tw = 1 (and, wlog, is connected). It has a vertex v of degree at most 1. Remove v; the remaining graph still has tw ≤ 1, so we can continue… Moreover, G has no cycle — a cycle would force tree width at least 2, a contradiction. A connected graph with no cycles is a tree.
Now assume G is a tree. Let's construct the decomposition as follows: define for each edge a bag with its two endpoint vertices.
Two bags are adjacent in the decomposition when their edges share a vertex of G (keeping only enough of these connections to form a tree). All three constraints hold, and every bag has size 2, so tw = 1.

Series-Parallel Graphs

TODO: Draw resistors…
Series-parallel graphs are exactly the graphs with tw ≤ 2. They are built starting from isolated vertices by:
1) Adding a vertex in series
2) Adding a self loop
3) Adding an edge in parallel to an existing edge
4) Subdividing an edge
Series-parallel ⇒ tw ≤ 2.
TODO: Draw

------ end of lesson 3

Graph Minor

A graph H is a minor of graph G if H can be obtained from G by:
(1) Removing vertices
(2) Removing edges
(3) Contracting edges
TODO: Draw graph
Definition: A subgraph is a graph generated by removing edges and vertices.
Definition: An induced subgraph is a graph on a subset of the vertices that includes all remaining edges.
Contracting an edge means joining its two endpoints together, such that the new vertex has edges to all the vertices the original two vertices had.
A graph is planar if and only if it contains neither K_5 nor K_{3,3} as a minor.
TODO: Draw the forbidden graphs
Definition: A graph is a forest (and, if connected, a tree) iff it does not contain a cycle — equivalently, iff it does not contain K_3 (a clique on 3 vertices) as a minor.
A graph is series-parallel iff it does not contain K_4 as a minor.

Theorem: There are planar graphs on n vertices with tree width Ω(√n).
Let's look at a √n by √n grid.
TODO: Draw the grid
We will construct √n − 1 bags: bag i contains columns i and i + 1. This is a tree decomposition, by all three properties, and it shows tw ≤ 2√n − 1; the theorem says we cannot do much better.

Vertex Separators

A set S of vertices in a graph G is a vertex separator if after removing S from G, the graph decomposes into connected components of size at most 2n/3. It means we can partition the connected components into two groups, neither of them with more than 2n/3 vertices.
TODO: Draw schematic picture of a graph

Every tree has a separator of size 1.
Let T be a tree. Pick an arbitrary root r, and let s(v) = the size of the subtree of v (according to r), so s(r) = n.
All leaves have size 1, so there exists a vertex v with s(v) > (2/3)n while s(u) ≤ (2/3)n for all children u of v. That v is the separator: each child subtree has at most (2/3)n vertices, and the rest of the tree has n − s(v) < n/3 vertices.

If a graph G has tree width k, then it has a separator of size k + 1.
My summary: Let D be some tree decomposition; each bag has at most k + 1 vertices. We find the separator bag of D as above and take its k + 1 vertices as the separator of G. Note that when we compute s(v) for a bag v ∈ D, we count the number of graph vertices inside the bags below it (not the number of bags).
His summary: Consider a tree decomposition T of width k. Let r serve as its root, and orient the edges of T away from r. Pick b to be the lowest bag whose subtree contains more than (2/3)n vertices.

Every separator in the √n by √n grid has size at least √n/6.
Why is this so? Assume otherwise: let S be a separator with fewer than √n/6 vertices. Then there are at least 5√n/6 rows that do not contain a vertex from S, and the same for columns. Let's ignore all vertices in the same row or column as a vertex of S: we ignore at most (√n/6)·√n + (√n/6)·√n = n/3 vertices.
Claim: All other vertices are in the same connected component, since we can walk freely along any S-free row and any S-free column. Therefore some connected component has more than (2/3)n vertices — a contradiction.

If G has tree width k, then G is colorable by k + 1 colors.

Every planar graph is 5-colorable.
Proof: A planar graph has a vertex of degree at most 5. We will use a variation of inductive coloring: pick the vertex with the smallest degree. Assume it has degree exactly 5 (smaller degrees are easy). Its 5 neighbors cannot form a clique (since the graph is planar), so some edge between two of them is missing. Contract these two neighbors together with the center vertex (the one with degree 5).
Fact: Contraction of edges maintains planarity. This is immediately seen when thinking of bringing the endpoints closer in the plane.
Now we color the new graph (after contraction).
Then we give all "contracted" vertices the same color in the original graph (before contraction). After the contraction the two non-adjacent neighbors share a color, so the center vertex sees at most 4 distinct colors among its 5 neighbors, and a fifth color remains for it — 5 colors suffice.
Hadwiger's conjecture: every graph that does not contain K_t as a minor can be colored with t − 1 colors. (A generalization of the 4-color theorem.)

If a graph has tree width k, then we can find its chromatic number in time O(f(k) · n).
Note: the chromatic number is definitely between 1 and k + 1 (we know the graph can be colored with k + 1 colors… the question is whether it can be colored with fewer). So for each value t ≤ k we ask: is G t-colorable?
Theorem (without proof): Given a graph of tree width k, a tree decomposition of width k can be found in time linear in n · f(k).
TODO: Draw the tree decomposition
For each bag, we keep a table of size t^(k+1) of all possible colorings of its vertices. For each entry in the table, we keep one bit that says whether that coloring of the bag is consistent with the bags below it. We can easily fill this for every leaf (0/1 depending on: "is this coloring legal for the subgraph below the bag") and then propagate upward; a coloring is legal if the subtree can be colored with no collision of color assignments.

A family F of graphs is minor-closed (or closed under taking minors) if whenever G ∈ F and H is a minor of G, then H ∈ F.
Characterizing F:
1) By some other property
2) List all graphs in the family (only if the family is finite)
3) List all graphs not in F (only if the complement of the family is finite)
4) List all forbidden minors
5) List a minimal set of forbidden minors
For planar graphs, this minimal set is K_5 and K_{3,3}.
An antichain of minors is a list of graphs G_1, G_2, … such that no graph is a minor of another.
A conjecture by Wagner: the minor relation induces a "well quasi order" — there is no infinite antichain of graphs. So every minor-closed family has a finite set of minimal forbidden minors!
The conjecture was proved by Robertson and Seymour.
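Before moving on: the bag tables above are filled leaf-to-root, the same pattern as the maximum weight independent set recurrence from earlier in the lesson. That simpler recurrence can be sketched as follows (a minimal sketch; the child-list dictionary format is an assumption):

```python
def max_weight_independent_set(tree, weight, root):
    """I+(v)/I-(v) recurrence on a rooted tree.
    tree: dict mapping each vertex to the list of its children.
    weight: dict mapping each vertex to its non-negative weight."""
    # Linearize so that every child appears after its parent...
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(tree[v])
    i_plus, i_minus = {}, {}
    # ...then sweep in reverse, so children are finished before parents.
    for v in reversed(order):
        i_plus[v] = weight[v] + sum(i_minus[u] for u in tree[v])
        i_minus[v] = sum(max(i_minus[u], i_plus[u]) for u in tree[v])
    return max(i_plus[root], i_minus[root])
```

On a path a-b-c with weights 1, 10, 1 this returns 10 (taking only b), as the recurrence dictates.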
----- End of lesson 4

Greedy Algorithms

When we try to solve a combinatorial problem, we have many options:
- Exhaustive search
- Dynamic programming
Now we introduce greedy algorithms as a new approach, and with them: scheduling theory and matroids.

Interval Scheduling

There are jobs that need to be performed at certain times: job i occupies the interval from s_i to t_i.
We can't schedule two jobs at the same time!
TODO: Draw intervals
We can represent the problem as a graph in which two intervals share an edge if they intersect, and then look for a maximum independent set. Such a graph is called an interval graph. There are classes of graphs for which Independent Set can be solved in polynomial time; one such family is the perfect graphs.
Algorithm: Sort the intervals by ending time and greedily select the interval that ends soonest; remove all its intersecting intervals and repeat until no intervals remain.
Why does it work?
Proof: Suppose the greedy algorithm picked intervals g_1, g_2, … Consider the optimal solution o_1, o_2, … that has the longest common prefix with the greedy one, and the first index i such that g_i ≠ o_i. Since g_i ends at the same time or sooner than o_i, we can replace o_i by g_i and get a new solution that is still optimal (same number of intervals) and still legal — so there exists an optimal solution with a longer common prefix, a contradiction!

Interval Scheduling 2

Each job J_i has a weight w_i > 0 and a length l_i > 0. The weight determines how important the job is; the length is how much time a CPU needs to perform the job.
The penalty for job i, given a particular schedule σ, is its completion time multiplied by w_i. The total penalty is Σ_i p_i(σ). Find a schedule σ that minimizes the total penalty.
Consider a schedule J_1 … J_i J_j …, and let's flip two adjacent jobs: J_1 … J_j J_i …, and suppose that by flipping, the total penalty grew. The penalty for the prefix and the suffix stays the same!
So we only need to compare the penalty contributed by J_i and J_j before and after the switch. If both start after time t:
J_i first:  w_i(t + l_i) + w_j(t + l_i + l_j)
J_j first:  w_j(t + l_j) + w_i(t + l_j + l_i)
Putting J_i first is better exactly when w_j·l_i < w_i·l_j, i.e. when w_i/l_i > w_j/l_j.
This suggests the following algorithm: schedule the jobs in decreasing order of w_i/l_i. This is optimal.
Proof: Consider any schedule that does not respect this order. Then there are adjacent jobs, i before j, with w_i/l_i < w_j/l_j, and swapping them gives a better schedule — a contradiction!

General Definition of Greedy-Algorithm-Solvable Problems — Matroids

S = {a_1, …, a_n} — the ground set of items (sometimes the items represent edges).
F — a collection of subsets of S (short for "feasible"; the members of F are often called "independent sets").
We want F to be hereditary: if B ∈ F and A ⊆ B, then A ∈ F.
We also need a cost function c: S → R+.
Goal: find a set A ∈ F with maximum total cost Σ_{a_i ∈ A} c(a_i).
Example: Given a graph G, the items are the vertices of G, F is the collection of independent sets of G, and c(x) = 1 for every x. Finding MIS(G) is NP-hard.

Definition: A hereditary family F is a matroid if and only if for all A, B ∈ F with |A| > |B| there exists a ∈ A, a ∉ B such that B ∪ {a} ∈ F.
Proposition: All maximal sets in a matroid have the same cardinality. Otherwise, if |A| > |B|, some item of A could be added to B while keeping it feasible, so B would not be maximal.
The maximal sets are called "bases", and the common cardinality of the bases is called the "rank" of the matroid.

We now have all sorts of terms — matroid, independent sets, basis, rank. How are they related? Consider a matrix: the items are the columns, and the independent sets are the linearly independent sets of columns. Note that this family is hereditary, because any subset of a linearly independent set of columns is also linearly independent. A basis is a maximal set of columns that spans the column space, and the rank is the same as in linear algebra.
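The greedy algorithm discussed next can be sketched generically with an independence oracle (`is_independent` is a hypothetical callback deciding membership in F):

```python
def greedy_max_cost(items, cost, is_independent):
    """Generic greedy over a hereditary family: scan items by decreasing
    cost, keeping an item whenever the current set stays feasible."""
    solution = set()
    for a in sorted(items, key=cost, reverse=True):
        if is_independent(solution | {a}):
            solution.add(a)
    return solution
```

For example, with the uniform matroid of rank 2 (`is_independent = lambda A: len(A) <= 2`) and cost equal to the item's value, the two largest items are selected.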
Theorem: For every hereditary family F, the greedy algorithm finds the maximum solution for every cost function c ⟺ F is a matroid.
Greedy algorithm: iteratively add to the solution the item a_i of largest cost that still keeps the solution feasible.
Proof: Assume that F is not a matroid. So (by definition) there are sets A, B ∈ F with |A| > |B| such that for every a ∈ A\B, B ∪ {a} ∉ F. Choose the costs:
For every b ∈ B:  c(b) = 1 + ε, where ε < 1/|S|
For every a ∈ A\B:  c(a) = 1
For every other item x:  c(x) = ε/2
Since the elements of B have the highest cost, greedy selects them first, until B is chosen and cannot be extended by anything of value 1. But the optimal solution is to select A (it has more unit-value elements than B has (1 + ε)-value ones, and ε is tiny) — so the greedy algorithm does not solve the problem!
Now suppose F is a matroid. Since c: S → R+ (all costs are positive), the optimal solution is a basis; likewise, the greedy solution is a basis. Sort each solution in order of decreasing cost: the greedy solution is g_1, …, g_r and the optimal solution is o_1, …, o_r, where r is the rank of the matroid.
Suppose they differ, and let i be the first index where g_i ≠ o_i (so g_1, …, g_{i−1} = o_1, …, o_{i−1}). By the greedy rule, c(g_i) ≥ c(o_i).
So let's build another optimal solution. Observe the set {g_1, …, g_{i−1}, g_i}. Because F is a matroid, we can repeatedly add to it items of {o_1, …, o_{i−1}, o_i, …, o_r} until it is as large as the optimal solution. All the added items come from {o_i, …, o_r}, and since the o's are sorted by decreasing cost, each added item has cost ≤ c(o_i) ≤ c(g_i). So the new basis costs at least as much as the optimal one and agrees with greedy on a longer prefix; repeating the argument shows greedy is optimal.
Greedy also works for a general cost function c, if the requirement is to find a basis.

Graphical Matroid

Graph G. The items are the edges of G; F is the set of forests of G — sets of edges that do not close a cycle.
If G is connected, the bases are the spanning trees.
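On the graphical matroid, the generic greedy becomes a spanning tree algorithm; a minimal sketch with a union-find as the independence test (the weighted edge-list format is an assumption):

```python
def max_weight_spanning_forest(n, edges):
    """Greedy on the graphical matroid: scan edges by decreasing weight,
    keeping an edge iff it doesn't close a cycle (union-find test).
    edges: list of (weight, u, v) triples over vertices 0..n-1."""
    parent = list(range(n))

    def find(x):  # root of x's component, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:  # independent in the matroid: no cycle closed
            parent[ru] = rv
            total += w
            chosen.append((u, v))
    return total, chosen
```

Sorting in increasing order instead gives a minimum spanning tree — Kruskal's algorithm, mentioned next.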
The greedy algorithm finds a maximum weight spanning tree. We can also find a minimum spanning tree this way — Kruskal's algorithm.

----- End of lesson 5

Matroids — a hereditary family of sets:
B ∈ F, A ⊆ B → A ∈ F
A, B ∈ F, |A| > |B| → ∃a ∈ A\B s.t. B ∪ {a} ∈ F
The natural greedy algorithm, given any cost function c: S → R+, finds the feasible set (member of F) of maximum total weight/cost. The maximal sets of F all have the same cardinality r (the rank), and are called bases.

Restriction: take S′ ⊆ S and let F′ = { A ∩ S′ : A ∈ F }. If (S, F) is a matroid, then so is (S′, F′), and its rank satisfies r′ ≤ r.

Dual of a Matroid

Given a matroid (S, F), its dual (S, F*) is the collection of all sets X such that S\X still contains a basis of (S, F).
In the graphical matroid — S is the edges of a graph G, F the forests of G, and the bases are the spanning trees — F* is any collection of edges whose removal leaves the graph connected.

Theorem: The dual F* of a matroid F is a matroid by itself. Moreover, (F*)* = F.
Proof: We need to show that the dual is hereditary — but this is easy to see: removing an item from X ∈ F* still does not interfere with the bases contained in S\X.
Now take A, B ∈ F*, |A| > |B|; we want some a ∈ A\B with B ∪ {a} ∈ F*.
TODO: Draw sets A and B.
Let's look at S′ = S − (A ∪ B). We know (S′, F′) is a matroid; let its rank be r′. If r′ = r, then S\(B ∪ {a}) ⊇ S′ contains a basis for any a, so we can move any item from A to B and remain in F*. So the only problematic case is r′ < r (it can't be larger).
Let B′ be a basis of (S′, F′), with |B′| = r′. Since A ∈ F*, the set S\A = S′ ∪ (B\A) contains a basis, so r − r′ ≤ |B\A| < |A\B|.
Since B ∈ F*, we can complete B′ to a basis of (S, F) inside S\B = S′ ∪ (A\B); the number of elements of A\B used is exactly r − r′ < |A\B|. So some item a ∈ A\B wasn't used! We can add that item to B, and S\(B ∪ {a}) still contains the basis we built — so B ∪ {a} ∈ F*.
Why is (F*)* = F?
Because the bases of the dual are exactly the complements of the bases of the original matroid: with |S| = n, all bases of the dual have size n − r, so r* = n − r, and complementing twice gives back F.
How did we find the minimum spanning tree? We sorted the edges by weight, and added an edge to the spanning tree as long as it doesn't close a cycle. In general: a minimum weight basis of F is the complement of a maximum weight basis of F*.

Graphical Matroids on Planar Graphs

TODO: Draw planar graphs
In the dual graph, every interior face (a region enclosed by edges) becomes a vertex, two vertices are connected if their faces share a common side (edge), and the exterior is a single vertex.
A minimal cut set in the primal is a cycle in the dual. The complement of a spanning tree in the primal is a spanning tree in the dual.
Assume we have a connected planar graph with v vertices, f faces, and e edges. Euler's formula:
v − 1 + f − 1 = e
We can always triangulate a planar graph, adding edges while keeping it planar. In a triangulation 3f = 2e, so:
v − 1 + (2e/3) − 1 = e  →  v − 2 = e/3  →  e = 3v − 6
The average degree is then 2e/v = 6 − 12/v — 6 minus something — so at least one vertex has degree 5 or less!

Matroid Intersection

Given two matroids (S, F1), (S, F2) on the same ground set, we can look at their intersection: sets A such that A ∈ F1 and A ∈ F2.

Partition Matroid
Partition S into S_1, S_2, …, S_m, and take every set that contains at most one item from each part.
TODO: Draw stuff
A matching is a collection of edges no two of which touch the same vertex. The matchings of a bipartite graph are exactly the intersection of two partition matroids (one per side). Note that in a bipartite graph the set of matchings is not a matroid by itself.
TODO: Draw example graph.
Theorem: For every cost function c: S → R+, one can optimize over the intersection of two matroids in polynomial time.
Intersections of 3 matroids are harder:
TODO: Draw partitions for the matroid.
Given a collection of triples, finding a maximum cardinality set of disjoint triples (3-dimensional matching) is NP-hard.

Things we will see:
1) Algorithm for maximum cardinality matching in bipartite graphs.
2) Algorithm for maximum cardinality matching in arbitrary graphs.
3) Algorithm for maximum weight matching in bipartite graphs.
There is also an algorithm for maximum weight matching in arbitrary graphs, but we will not show it in class.

Vertex cover — a set of vertices that covers (touches) all edges.
Min vertex cover ≥ maximum matching
Min vertex cover ≤ 2 · maximal matching
For bipartite graphs, min vertex cover = maximum matching.

Alternating Paths

TODO: Draw graphs
Vertices not covered by our current matching are called "exposed". An alternating path connects two exposed vertices, and its edges alternate with respect to being in the matching. Given an arbitrary matching M and an alternating path p, we can get a larger matching by switching the matching along p.
Alternating forest: a collection of rooted trees whose roots are the exposed vertices, and whose paths from the roots alternate.
TODO: Draw alternating forest
In a single step, either:
- connect two exposed vertices, or
- add two edges to some tree.
The procedure to extend a tree: pick an outer vertex and connect it to
- an exposed vertex → alternating path → extend the matching
- an outer vertex of a different tree → we can get from the root of one tree to the root of the other → alternating path → extend the matching
- an unused matched edge → extend the tree (alternating forest) by two edges.
Claim: When I'm stuck, I certainly have a maximum matching. Proof by picture: select the inner vertices as a vertex cover.
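For the bipartite case, the augmenting-path idea above can be realized compactly (a sketch; the left/right adjacency format is an assumption):

```python
def bipartite_max_matching(adj, n_right):
    """Maximum matching by repeated augmenting-path search.
    adj[u] lists the right-side neighbors of left vertex u."""
    match_right = [None] * n_right  # right vertex -> matched left vertex

    def try_augment(u, seen):
        # Search for an alternating path from left vertex u to an
        # exposed right vertex; flip the matching along it if found.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_right[v] is None or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(len(adj)))
```

Each left vertex gets one search; a failed search means that vertex stays exposed in every maximum matching, which is exactly the "when I'm stuck" claim.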
--- end of lesson 6

Matchings

TODO: Draw an alternating forest
1) If you find an edge between two exposed vertices, you can just add it to the matching.
2) An edge between two outer vertices (in different trees) → alternating path.
3) An edge from an outer vertex (the exposed vertices are also considered outer vertices) to an unused matched edge → extend the tree.
In cases 1 and 2 we restart building the forest.

Gallai-Edmonds Decomposition

Outer vertices are denoted C (components).
Inner vertices are denoted S (separator).
The rest are denoted R (rest).
We don't have edges inside C — otherwise the algorithm would not have stopped. We also don't have edges between C and R — otherwise it wouldn't have stopped. In short, we may have edges between C and S, inside S, between S and R, and inside R.
The only matching edges used are internal edges of R and edges between C and S. All vertices of R are matched, all vertices of S are matched, and the number of vertices of C left unmatched is |C| − |S|.
Another definition of C, S and R:
C is the set of vertices left unmatched by some maximum matching.
S is the set of all neighbors of C.
R is the rest.
The number of vertices not matched in a maximum matching is exactly |C| − |S|.

General Graphs

In general graphs we might have odd cycles, so case 2 doesn't work anymore (because such an edge can be inside the same tree). We need a new case:
2a) An edge between two outer vertices of the same tree. In this case we get an odd cycle. Contract the odd cycle! The contracted vertex is an outer vertex.
Suppose we had a graph G, and now we have G′ as the contracted graph. First note the following: the size of a maximum matching in G′ equals the size of a maximum matching in G minus q, if the odd cycle had length 2q + 1. So we lift the matching from G′ back to G and restart building a forest (instead of just restarting).
Now we can see why we call the outer vertices C – they might represent components (contracted odd cycles). One can see that each component always represents an odd number of vertices.
Components might have edges inside themselves, but not between two components. So the separator really separates the different components from each other and from R. All vertices of R are matched. Each component in C is matched either to some vertex in S or to none (a component is matched to at most one vertex of S). But then this is an optimal matching, which means the algorithm is correct: it finds a matching that misses |C| − |S| vertices, and this is the minimal number of vertices missed by any matching, so our solution is optimal.
Theorem (the Tutte–Berge bound): if a graph has a set S of vertices whose removal leaves t components of odd size (and any number of components of even size), then the size of a maximum matching is at most (n − (t − |S|)) / 2.

Minimum Weight Perfect Matching in Bipartite Graphs
Weights are non-negative, and we try to find the perfect matching with minimal weight. Let's first observe the variant where we search for maximal weight: if an edge is missing, we can add it with weight zero, and then a perfect matching always exists. We can then apply a transformation to turn the maximal variant into the minimal variant. So the problem is well defined even for graphs without a perfect matching.
We will use:
- Weight shifting
- The primal–dual method
In our case the primal problem is the minimum weight covering of the vertices by edges; a minimum weight perfect matching is such a covering. The dual problem is a packing problem: assign non-negative weights to the vertices such that for every edge, the sum of the weights of its endpoints is at most the weight of the edge; maximize the sum of the vertex weights.
TODO: Draw a bipartite graph
The primal is a minimization problem, so it's at least as large as the dual. For optimal solutions there is equality!
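On a toy instance the primal–dual relationship can be checked by brute force: enumerate all perfect matchings for the primal, and verify that a hand-picked dual solution is feasible and meets the optimum. The weights and the dual values below are my own choices:

```python
from itertools import permutations

# 2x2 bipartite graph: w[i][j] is the weight of edge (left i, right j).
w = [[1, 2],
     [3, 4]]
n = 2

# Primal: minimum weight perfect matching, brute-forced over permutations.
primal = min(sum(w[i][p[i]] for i in range(n)) for p in permutations(range(n)))

# Dual: vertex weights y (left side) and z (right side), feasible iff
# y[i] + z[j] <= w[i][j] for every edge; the objective is sum(y) + sum(z).
y, z = [1, 3], [0, 1]
feasible = all(y[i] + z[j] <= w[i][j] for i in range(n) for j in range(n))
dual = sum(y) + sum(z)
```

Here both matchings cost 5, and the dual solution is feasible with value 5, so the two optima coincide, exactly the equality claimed for optimal solutions.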
We will show this; it is a theorem of Egerváry from 1931.
Let's try to reduce the problem to an easier one. Let w_min be the smallest weight of an edge. Subtract w_min from the weight of every edge; every perfect matching loses exactly n · w_min from its weight. As for the dual problem, we can start with weight 0 on every vertex and increase the weight of every vertex by w_min / 2. The dual we got is feasible, and we decreased the weights of the edges.
Let v be some vertex and let w_v be the smallest weight among its edges. We can subtract w_v from all of its edges; since one of them has to be in the final perfect matching, the perfect matching loses exactly w_v. In the dual, we can increase the weight of v by w_v.
We will keep the following invariants:
- For every edge, the total weight it has lost equals the total weight gained by its endpoints.
- At any state, edges have non-negative weights.
Consider only the edges of zero weight and find a maximum matching among them. There are two possibilities:
- It is a perfect matching. In this case it is optimal.
- It is not a perfect matching. Then look at the Gallai–Edmonds decomposition of the zero-weight graph. Let ε be the minimum between the smallest weight of an edge between C and R, and half the smallest weight of an edge between C and C. For every vertex in C, add ε to its weight; for every vertex in S, subtract ε from its weight. With respect to the matched edges, we did not change anything, but we created at least one more zero-weight edge! (Note that no edge became negative.) Either we increased the matching (for a new zero edge between C and C), or we increased S (for a new zero edge between C and R: the R endpoint is now a neighbor of some vertex in C, so it moves into S).
Every time we make progress, so after O(n²) weight-shifting steps we get a perfect matching of weight zero in the shifted weights – a minimum weight perfect matching.

--- end of lesson 7

Algebraic Methods in Algorithms
Searching for Triangles
By exhaustive search we can go over all triples and do it in O(n³). The question is: can we do better?
Multiplication of Two Complex Numbers
(a + ib)(c + id) = ac − bd + i(ad + bc)
Assume the numbers are large! Multiplications are rather expensive, much more than additions and subtractions. Let's compute ac and bd, then compute a + b and c + d, and then calculate
(a + b)(c + d) = ac + ad + bc + bd.
Using the previously computed values we can extract ad + bc. So the naïve way uses 4 products and 2 additions/subtractions; the new way uses 3 products and 5 additions/subtractions.

Fast Matrix Multiplication
Let A = (a_ij) and B = (b_ij) be n × n matrices, and let C = AB = (c_ij), where c_ij = Σ_k a_ik · b_kj. So we need n³ products to compute C. Unlike the previous problem, here we want to reduce both the number of products and the number of additions.
We will show a very well-known algorithm by Strassen. Assume for simplicity that n is a power of 2; if not, we can pad with zeroes up to the next power of 2. Partition A and B into blocks:
A = [A11 A12; A21 A22],  B = [B11 B12; B21 B22]
and then
C = [A11·B11 + A12·B21, A11·B12 + A12·B22; A21·B11 + A22·B21, A21·B12 + A22·B22] = [C11 C12; C21 C22].
So T(n) = 8·T(n/2) + O(n²), and solving the recursion gives T(n) = O(8^log n) = O(n³). If we could compute everything with 7 multiplications (instead of 8), the time would be
T(n) = 7·T(n/2) + O(n²) = O(7^log n) = O(n^log 7) = O(n^2.81).
Think of each product of block sums as a 4 × 4 sign matrix: rows indexed by A11, A12, A21, A22, columns by B11, B12, B21, B22, with a +1 or −1 entry wherever the product A_xy · B_zw appears in the expansion. A single product (sum of A blocks) · (sum of B blocks) always gives a sign matrix of rank 1, while each target block, e.g. C11 = A11·B11 + A12·B21, is a sign matrix of rank 2 (two +1 entries in different rows and columns). So we must use 7 matrices of rank 1 to generate the 4 target matrices of rank 2.
P1 = (A11 + A21)(B11 + B12)
P2 = (A12 + A22)(B21 + B22)
We can observe that P1 + P2 = C11 + C12 + C21 + C22; in other words
C22 = P1 + P2 − C11 − C12 − C21.
So with two products we already get one expression for free; we only need to generate the three blocks C11, C12, C21.
P3 = A21(B11 + B21)
P4 = (A22 − A21)B21
so C21 = P3 + P4.
P5 = A12(B12 + B22)
P6 = (A11 − A12)B12
so C12 = P5 + P6.
We have already used 6 products, generated C12 and C21, and we will be able to compute C22 at the end. We are only missing C11, and only one multiplication is left! Can we do it??? (the tension!)
P7 = (A12 + A21)(B21 − B12)
C11 = P7 + P1 − P3 − P6
We did it!! The world was saved!
But you can multiply matrices even faster! If you divide the matrix into 70 parts, you can find algorithms that run in O(n^2.79). The best known result is O(n^2.37). Often people just say that matrices can be multiplied in time O(n^ω) and use it as a subroutine. The trivial lower bound is n².

Multiplying Big Numbers
Multiplying numbers of n digits takes O(n²) using the naïve approach. You can break every number into two halves and apply an approach similar to the one used when multiplying complex numbers. The best running time is something like n · log n · log log n.

Testing
How can you check that A · B = C is correct?
Probabilistic test: pick a random 0/1 vector x ∈ {0,1}^n, e.g. x = 0011001…
If you got the right answer: A · B · x = C · x.
But this is rather easy to check! A multiplication of a matrix by a vector takes O(n²) time, and by associativity we can multiply x by B first and compare A(Bx) with Cx.
Suppose the true answer is C but we got C′. With a random vector there is a good probability that Cx ≠ C′x: since they are different, there must be an entry where c_ij ≠ c′_ij. The chance that we catch the bad matrix is at least ½.
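A minimal sketch of this randomized check (it is known as Freivalds' test); the function name and example matrices are my own:

```python
import random

def freivalds(A, B, C, rounds=20):
    """Probabilistically check A @ B == C using O(n^2) work per round.

    Each round picks a random 0/1 vector x and compares A(Bx) with Cx.
    A wrong C survives a single round with probability at most 1/2,
    hence survives all rounds with probability at most 2**-rounds.
    """
    n = len(A)
    matvec = lambda M, x: [sum(M[i][j] * x[j] for j in range(n))
                           for i in range(n)]
    for _ in range(rounds):
        x = [random.randint(0, 1) for _ in range(n)]
        if matvec(A, matvec(B, x)) != matvec(C, x):
            return False          # certainly wrong
    return True                   # probably correct

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]     # the true product A @ B
C_bad  = [[19, 22], [43, 51]]     # one wrong entry
```

With 50 rounds a wrong product slips through with probability 2⁻⁵⁰, while each round costs only two or three matrix-vector products.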
The reason is too long for the margins of these notes.
The inverse of a matrix can also be calculated using matrix multiplication, so finding an inverse takes O(n^ω) as well.

Boolean Matrix Multiplication
There are settings in which subtraction is not allowed. One such setting is Boolean matrix multiplication: multiplication is an AND operation and addition is an OR operation, so it's clear why we can't subtract. In this case we can replace OR with ordinary addition and AND with ordinary multiplication, and then:
c′_ij = 0 ⟺ c_ij = 0
c′_ij > 0 ⟺ c_ij = 1
If you don't like big numbers, you can do everything modulo some prime p. But there are worst cases where things stop working, such as when the OR is replaced by XOR.

Back to Searching for Triangles
Order the vertices v1, …, vn and let A be the adjacency matrix: A_ij = 1 ⟺ (i, j) ∈ E, and otherwise it's zero.
Let's look at A². (A²)_ij is the number of walks of length 2 between vertices i and j, and in general (A^t)_ij is the number of walks of length t between i and j. So (A³)_ii is the number of closed walks of length 3 from i back to itself. But every such walk is a triangle! So trace(A³) = 6 · (number of triangles in G), since every triangle is counted from each of its 3 vertices, with 2 possible directions from each. And you can calculate A³ in O(n^ω)!
In the Boolean setting, ((A ∨ I)^t)_ij = 1 ⟺ there is a path of length at most t from i to j. So (A ∨ I)^n actually encodes all the reachability of the graph, and by performing log n Boolean products, O(n^ω) time each, we can find the transitive closure of the graph.
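The trace formula can be checked with a plain cubic matrix product; the example graph (K4, where every triple of vertices forms a triangle, so C(4,3) = 4 triangles) is my own:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_triangles(A):
    # trace(A^3) counts every triangle 6 times:
    # 3 starting vertices x 2 walking directions.
    A3 = matmul(matmul(A, A), A)
    return sum(A3[i][i] for i in range(len(A))) // 6

# Complete graph K4: adjacency matrix is all ones off the diagonal.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
```

Replacing `matmul` with a fast O(n^ω) routine gives exactly the speedup discussed above; the counting logic is unchanged.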
What about per(A) (mod 3)?
Suppose n = 30. The number of permutations is 30! ≈ (30/e)^30. There is an algorithm that is still exponential but substantially faster. It was suggested by Ryser, and its running time is around 2^n; in the case of n = 30 that's 2^30, which is not so bad. The trick is to use the inclusion–exclusion principle.
Let A be an n × n matrix and S a non-empty set of columns, and let R_i(S) be the sum of the entries of row i over the columns in S. Then
per(A) = Σ_S (−1)^(n−|S|) · Π_{i=1..n} R_i(S).
Since we go over all S, we have 2^n terms, so the running time is around 2^n times some polynomial.
Now look at the permanent of a matrix of variables:
per(x_11 … x_1n; …; x_n1 … x_nn) = Σ_σ Π_i x_{i,σ(i)}.
This is a multilinear polynomial with n! monomials; each monomial is a product of n variables out of the n² variables. This applies both to the determinant and to the permanent.
In the original definition we get monomials with n variables – one from each row, and each from a different column. In Ryser's formula we also get monomials with one variable from each row, but the columns may repeat.
First let's see that all the terms that should be in Ryser's formula are actually there: Π_{i=1..n} x_{i,σ(i)} appears only for S = the set of all columns, with sign +1.
Now consider a monomial that uses n variables but only some set T of fewer than n distinct columns, e.g. (for n = 5, reconstructing the lecture's example) x_11 x_21 x_33 x_43 x_55: 5 variables, columns T = {1, 3, 5}. Such a monomial appears in the term of every S ⊇ T, so its total coefficient is Σ_{S ⊇ T} (−1)^(n−|S|) · 1 = 0, since the supersets of T split evenly between the two signs.
TODO: Draw many partially drawn matrices.
For every variable independently, substitute a random value from {0, …, p − 1}, where p > n² is prime, then compute the determinant; we can even do the computations modulo p.
Lemma: for every non-zero multilinear polynomial in n² variables, if one substitutes random values from {0, …, p − 1} and computes modulo the prime p, the answer is 0 with probability at most n²/p.
For one variable: ax + b ≡ 0 (mod p) with probability 1/p. Suppose by induction the probability is k/p for k variables; we want to show it's (k + 1)/p for k + 1 variables.
We can write the polynomial in k + 1 variables as
P(x_1, …, x_{k+1}) = x_{k+1} · P_1(x_1, …, x_k) + P_2(x_1, …, x_k).
If P_1 evaluates to zero, this happens with probability at most k/p (by induction); otherwise there is only one possible value of x_{k+1} that makes everything zero. So the total probability is at most (k + 1)/p.
Kasteleyn: computing the number of perfect matchings in planar graphs can be done in polynomial time.
Kirchhoff: the matrix-tree theorem, which counts the number of spanning trees in a graph. We have a graph G and we want the number of spanning trees.
The Laplacian of G is the n × n matrix L with
L_ii = degree of vertex i,
L_ij = L_ji = −1 ⟺ (i, j) ∈ E (and 0 otherwise).
TODO: Draw the graph and its corresponding matrix (though we got the point)
The algorithm: generate the Laplacian matrix of the graph, cross out some row and the same-numbered column, and calculate the determinant. This is the number of spanning trees of the graph.
Given G, direct its edges arbitrarily and create the incidence matrix of the graph: for each edge you put +1 at the vertex it enters and −1 at the vertex it leaves. For example:
      e1   e2   e3   e4
v1   +1    0    0    0
v2   −1   −1   +1    0
v3    0    0   −1   −1
v4    0   +1    0   +1
Denote this matrix N. If we multiply N · Nᵀ we get the Laplacian.
The incidence matrix has special properties. One of them is that it is totally unimodular: every square sub-matrix has determinant +1, −1 or 0.
Theorem: every matrix in which every column has at most a single +1, at most a single −1, and the rest of the entries zero, is totally unimodular.

--- end of lesson

Given a graph G we have its Laplacian, denoted L, and N, the vertex–edge incidence matrix. We know that N·Nᵀ = L. Another thing we said about N is that it is totally unimodular, which means that every square submatrix has determinant 0, +1 or −1. Wlog we always look at the case m ≥ n (m is the number of edges, n is the number of vertices).
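The matrix-tree algorithm above (build the Laplacian, delete a row and the matching column, take the determinant) can be sketched directly. The example graphs are my own, and the determinant runs over exact rationals to avoid floating-point error:

```python
from fractions import Fraction

def det(M):
    # Gaussian elimination over exact rationals.
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return int(result)

def spanning_trees(edges, n):
    # Laplacian: degrees on the diagonal, -1 for each edge.
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    # Cross out row 0 and column 0, then take the determinant.
    minor = [row[1:] for row in L[1:]]
    return det(minor)

# K4: all 6 edges on 4 vertices; Cayley's formula gives 4**2 = 16 trees.
K4_edges = [(u, v) for u in range(4) for v in range(u + 1, 4)]
```

A triangle gives 3 spanning trees (drop any one edge), which is a handy second sanity check.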
If the number of edges is smaller than the number of vertices, either all the edges together form a spanning tree or we don't have a spanning tree at all.
rank(N) ≤ n − 1: if we take x = [1 … 1] then x is in the kernel, since every column of N has one +1 and one −1.
TODO: Draw a transposed incidence matrix.
What do we know about a submatrix N_T with a subset T of n − 1 edges (but all the vertices)? If T is a spanning tree, then its rank is n − 1; otherwise its rank is < n − 1. Delete one row (one vertex) from N and denote the result N′; let N′_T be its restriction to the columns of T. Then
det(N′_T · (N′_T)ᵀ) = det(N′_T) · det((N′_T)ᵀ) = 1 if T is a spanning tree, and 0 otherwise.
So the number of spanning trees = Σ_T det(N′_T) · det(N′_T).
Binet–Cauchy formula for computing a determinant: let A and B be two n × m matrices with m > n. Then
det(A · Bᵀ) = Σ_S (det A_S)(det B_S)
where S ranges over all subsets of n columns.
If m = n then det(A · Bᵀ) = (det A) · (det Bᵀ). If n > m the determinant is zero, so it's not interesting.
If m > n, set A = B = N′. Then everything comes out: det(N′ · N′ᵀ) = Σ_T (det N′_T)² = the number of spanning trees.
Proof of the Binet–Cauchy formula: let Λ be the m × m diagonal matrix with variables x_1, …, x_m on the diagonal. We will prove
det(A·Λ·Bᵀ) = Σ_S (det A_S)(det B_S)(Π_{i∈S} x_i),
which is a stronger statement than the one we need (substitute x_i = 1).
A·Λ·Bᵀ is an n × n matrix. Every entry of the matrix is a linear polynomial in the variables x_i – linear because we never multiply an x_i by an x_j, and there are no constant terms. So det(A·Λ·Bᵀ) is a homogeneous polynomial of degree n. We can take any monomial of degree n and see that the coefficients on both sides are the same.
On the right-hand side every monomial is a product of n distinct variables. We need to prove that on the left side, monomials supported on fewer than n distinct variables have coefficient zero. Let T be a set of fewer than n variables, and substitute 0 for all variables not in T, getting a diagonal matrix Λ′. Then A·Λ′·Bᵀ has rank at most |T| < n, so its determinant is identically zero, and all the monomials supported on T cancel.
BAAAAHHH. Can't write anything in this class!

Spectral Graph Theory
NP-hard problems on random instances.
Like 3SAT, Hamiltonicity, k-clique, etc. We sometimes want to look at max-clique (which is the optimization version of k-clique). We can also look at the optimization version of 3SAT, MAX-3SAT.
3XOR or 3LIN: (x1 ⊕ x̄2 ⊕ x3), … This problem is generally in P (it is linear algebra over GF(2)), but if we look for an assignment satisfying the maximal number of equations, we get an NP-hard problem.
Motivations:
- Average case (good news)
- Developing algorithmic tools
- Cryptography
- Properties of random objects
Heuristics for 3SAT:
"Yes" – find a satisfying assignment in many cases (answers yes/maybe).
Refutation/"no" – find a "witness" that guarantees that the formula is not satisfiable (answers no/maybe).
Hamiltonicity: G_{n,p} – the Erdős–Rényi random graph model. Between any two vertices independently, place an edge with probability p, e.g. G_{n, p = 5/n}; p doesn't have to be a constant. One can also think of a process of putting random edges into the graph one by one.
3SAT: at first, a short list of clauses is satisfiable, but as the formula becomes larger it is no longer clear that it is satisfiable. Where is the transition point? The conjectured threshold is about 4.3·n clauses; there is a proof for 3.5·n. There is, however, a proof that this is a sharp transition.
For refutation the condition is exactly opposite: the longer the formula, the easier it is to find a witness for "no".

--- end of lesson

Adjacency matrix: the symmetric 0/1 matrix with A_ij = 1 ⟺ (i, j) ∈ E.
For regular graphs there is no essential difference between the adjacency matrix and the Laplacian. For irregular graphs there are differences, and we will not get into them.
Properties: non-negative, symmetric; connected ⟺ irreducible. Irreducible means we cannot permute the vertices so that the matrix becomes a block matrix whose upper-right block is all zeroes.
The eigenvalues λ1 ≥ λ2 ≥ … ≥ λn are all real; we might have multiplicities, so we also allow equality. Let v1, …, vn be corresponding eigenvectors. If we take two distinct eigenvalues λi ≠ λj then vi ⊥ vj, and ℝⁿ has an orthogonal basis of eigenvectors of A.
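These spectral facts can be verified by hand on a small example; the 4-cycle and the two vectors below are my own illustration:

```python
# Adjacency matrix of the 4-cycle C4 (vertices 0-1-2-3-0).
A = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

# C4 is 2-regular: the all-ones vector is an eigenvector with eigenvalue 2.
ones = [1, 1, 1, 1]
# C4 is bipartite with sides {0, 2} and {1, 3}: flipping the sign on one
# side gives an eigenvector with eigenvalue -2.
signed = [1, -1, 1, -1]

lam_ones = matvec(A, ones)      # equals 2 * ones
lam_signed = matvec(A, signed)  # equals -2 * signed
dot = sum(a * b for a, b in zip(ones, signed))  # distinct eigenvalues -> 0
```

The zero inner product is exactly the orthogonality of eigenvectors belonging to distinct eigenvalues.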
If we look at the eigenvector that corresponds to the largest eigenvalue, then the eigenvector is positive (all its entries are positive); λ1 > λ2, and λ1 ≥ |λn|. (The last two properties are only true if the graph is connected; this is the Perron–Frobenius theorem.)
x is an eigenvector with eigenvalue λ if Ax = λx. In graph terms: the entries of x give values to the vertices of the graph, and x is an eigenvector of A if for every vertex, the sum of the values of its neighbors equals λ times its own value.
TODO: Draw graph.
trace(A) = 0 ⟹ Σ λi = trace = 0. Since λ1 > 0, we get λn < 0.

Bipartite Graphs
Let x be an eigenvector with non-zero eigenvalue λ.
TODO: Draw bipartite graph
By flipping the sign of x on one side of the bipartite graph, we get a new eigenvector with eigenvalue −λ. This is a property only of bipartite graphs (as long as we're dealing with connected graphs).
Consider a vector x and observe the quantity
xᵀAx / xᵀx.
This is called the Rayleigh quotient. If x is an eigenvector of eigenvalue λ then
xᵀAx / xᵀx = xᵀ(λx) / xᵀx = λ.
Let v1, …, vn be an orthonormal eigenvector basis of ℝⁿ. Then x = c1·v1 + … + cn·vn, where ci = ⟨x, vi⟩, and
A(Σ ci·vi) = Σ ci·λi·vi, so xᵀAx / xᵀx = (Σ ci²·λi) / (Σ ci²).
This is a weighted average where every λi gets the weight ci², so this expression cannot be larger than λ1 and cannot be smaller than λn.
Suppose we know that x ⊥ v1. Then the quotient is a weighted average of all the eigenvalues except λ1.
Another way of getting the same thing: xᵀAx = Σ_{i,j} a_ij·x_i·x_j (sum the element-wise multiplication of the matrix A and the matrix x·xᵀ).
A large max-cut ⟹ λn is close to −λ1; one can show it using Rayleigh quotients. But the interesting thing is that the opposite direction is not so true!
We looked at the relation between λ1 and λn. Now let's look at the relation between λ1 and λ2. If a graph is d-regular, the all-ones vector is an eigenvector with eigenvalue d, and λ1 = d.
If we have a disconnected d-regular graph, λ1 and λ2 are equal: the indicator vector of each connected component is an eigenvector with eigenvalue d, so d appears with multiplicity at least 2. Conversely, for a connected d-regular graph λ1 > λ2, so if they are equal the graph is disconnected!
What happens if they are close to being equal? If λ2 is close to λ1, then G is close to being disconnected!
Suppose we have a graph.
TODO: Draw the graph we are supposed to have.
We perform a random walk on the graph. What is the mixing time? If a token starts at a certain vertex, what is the probability that it is found at each of the other vertices? If this turns quickly into the uniform distribution, then the mixing is good. Small cuts are obstacles to fast mixing.
If we started at the first vertex, we started with the probability vector [1 0 … 0] = Σ ci·vi. One step of the random walk on a d-regular graph is multiplication by A/d, so after t steps we have
(A/d)^t (Σ ci·vi) = Σ ci·(λi/d)^t·vi.
If λ1 = d ≫ |λi| for every i ≥ 2, then we quickly get a uniform distribution over all the vertices.

λ1 for non-regular graphs
Suppose the graph has maximum degree Δ and average degree d̄. We know λ1 ≤ Δ, but it is always true that λ1 ≥ d̄: take xᵀ = (1, 1, …, 1) and consider the Rayleigh quotient
xᵀAx / xᵀx = (sum over all entries of A) / n = 2m / n = average degree.
The c-core of a graph is the largest subgraph in which every vertex has degree ≥ c. Let c* be the largest c for which the graph G has a c-core. For d-regular graphs c* = d.
Claim: λ1 ≤ c*. Let v be the eigenvector associated with λ1. Sort the vertices by their value in the vector v, so v1 is the largest value and vn the smallest, and all the values are positive (by the Perron–Frobenius theorem). The graph cannot have a (λ1 + 1)-core: the core vertex with the smallest value would be connected to more than λ1 core vertices whose values are at least its own, so λ1 times its value could not equal the sum of the values of its neighbors.
If we look at a star graph: Δ = n − 1, d̄ = 2 (a bit less than 2), and the largest eigenvalue is λ1 = √(n − 1): we can give the center the value √(n − 1) and the rest the value 1.
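The star-graph claim is easy to verify with one matrix-vector product; the check below (for n = 10, my own choice) confirms that the proposed vector really is an eigenvector:

```python
import math

n = 10                       # star K_{1,9}: center 0, leaves 1..9
A = [[0] * n for _ in range(n)]
for leaf in range(1, n):
    A[0][leaf] = A[leaf][0] = 1

lam = math.sqrt(n - 1)       # claimed largest eigenvalue
x = [lam] + [1.0] * (n - 1)  # center gets sqrt(n-1), every leaf gets 1

# At the center: sum of leaves = n-1 = lam * lam.
# At a leaf: value of center = lam = lam * 1.
Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
ok = all(abs(Ax[i] - lam * x[i]) < 1e-9 for i in range(n))
```

Since all entries of x are positive, Perron–Frobenius tells us this eigenvalue is indeed the largest one, λ1 = √(n − 1).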
Sometimes when we have a graph that is nearly regular but has a small number of vertices of high degree, we should remove the vertices of high degree, since they skew the spectrum!
Let α(G) denote the size of the maximum independent set in the graph, and assume that G is d-regular. Then (Hoffman's bound):
α(G) ≤ n · (−λn) / (d − λn).
If G is bipartite, then λn = −λ1 = −d, and then α(G) ≤ n·d / (2d) = n/2, as expected.
Proof: λn ≤ xᵀAx / xᵀx for every vector x. Let S be the largest independent set in G and s = |S|.
TODO: Draw x
Give the entries of S the value n − s and the other vertices the value −s. Then Σ x_i = s(n − s) − (n − s)s = 0, so x ⊥ (1, …, 1) = v1, and
xᵀx = s(n − s)² + (n − s)s² = n·s(n − s).
Since S is independent, every edge either goes from S to the rest or stays inside the rest, so
xᵀAx = 2·sd·(n − s)(−s) + (nd − 2sd)·s² = −n·d·s².
Therefore
λn ≤ xᵀAx / xᵀx = −d·s / (n − s),
and rearranging: d·s ≤ −λn·(n − s), so s·(d − λn) ≤ n·(−λn), i.e.
s ≤ n·(−λn) / (d − λn).
If |λn| ≪ d we get a very good bound; otherwise we don't.
Σ λi = 0. We can also look at the trace of A², which equals Σ (λi)² and, on the other hand, also equals the sum of the degrees, 2m; for regular graphs this is n·d. So the average squared eigenvalue is d, and Avg |λi| ≤ √d.
Recall that λ1 = d! It turns out that in random graphs (regular or nearly regular), with high probability every i ≥ 2 has |λi| = O(√d). In almost all graphs |λ1| ≫ |λn|. Therefore:
s ≤ n·(−λn) / (d − λn) ≤ n·2√d / (d + 2√d) ≤ 2n / √d.
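As an illustration of the bound (my own example, not from the lecture): the Petersen graph is 3-regular on 10 vertices with smallest eigenvalue λn = −2 (its spectrum is known to be 3, then 1 with multiplicity 5, then −2 with multiplicity 4), so Hoffman's bound gives α ≤ 10 · 2 / 5 = 4, and it is tight there:

```python
from itertools import combinations

# Petersen graph: outer 5-cycle 0..4, inner pentagram on 5..9, plus spokes.
n = 10
edges = set()
for i in range(5):
    edges.add(frozenset((i, (i + 1) % 5)))          # outer cycle
    edges.add(frozenset((5 + i, 5 + (i + 2) % 5)))  # inner pentagram
    edges.add(frozenset((i, i + 5)))                # spokes

def alpha(n, edges):
    # Brute-force maximum independent set (fine for n = 10).
    for size in range(n, 0, -1):
        for sub in combinations(range(n), size):
            if all(frozenset(e) not in edges for e in combinations(sub, 2)):
                return size
    return 0

d, lam_n = 3, -2                      # 3-regular; known smallest eigenvalue
bound = n * (-lam_n) // (d - lam_n)   # n(-lam_n)/(d - lam_n) = 10*2/5 = 4
```

Here |λn| = 2 is much smaller than what a bipartite graph would force, so the bound lands well below n/2 = 5.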