Heuristics, Experimental Subjects, and Treatment Evaluation in Bigraph Crossing Minimization

Matthias Stallmann, Franc Brglez, and Debabrata Ghosh
North Carolina State University

The bigraph crossing problem, embedding the two node sets of a bipartite graph G = (V0, V1, E) along two parallel lines so that edge crossings are minimized, has application to circuit layout and graph drawing. Our experimental study looks at various heuristics, input properties, and data analysis approaches. Among our main findings are the following.

—The most dramatic differences among heuristics show themselves on trees. As graphs, particularly random ones, become denser, the differences are much less clear. The densest graphs in our study, which have average node degree 16, produce no noticeable difference in crossing numbers among the leading heuristics.

—Among graphs of roughly the same density, those that are structured — we consider butterflies, meshes, hypercubes, graphs motivated by circuits, and minimum spanning trees of randomly selected points in the plane — tend to benefit most from initial placement (breadth-first or depth-first search), while more random ones respond more to iterative improvement (for example, repeated application of the barycenter heuristic).

—The best heuristics combine good initial placement with iterative improvement. One such heuristic, our newly devised tr17, ranks first or second among heuristics of reasonable efficiency for most classes of graphs.

—The widely used dot heuristic,∗ also a combination of initial placement and iterative improvement, performs poorly relative to simpler, more efficient heuristics, especially as graphs become larger. The best results for dot are obtained on random trees, trees motivated by VLSI circuits, meshes, and hypercubes.
—Minimum spanning trees on random points within narrow rectangles provide experimental subjects whose minimum number of crossings approaches 0 in a controllable fashion.

Source code, scripts for generating data, running heuristics, and analyzing results, and the random number seeds used to generate all our data are available on the world-wide web at URL www.cbl.ncsu.edu/software/2001-JEA-Software/.

The research of F. Brglez and D. Ghosh has been supported by contracts from Semiconductor Research Corporation (94–DJ–553), SEMATECH (94–DJ–800), DARPA/ARO (P–3316–EL/DAAH04–94–G–2080) and (DAAG55-97-1-0345), and a grant from Semiconductor Research Corporation. D. Ghosh is currently with Intel Corporation.

Name: Matthias Stallmann, Franc Brglez, and Debabrata Ghosh
Affiliation: Dept. of Computer Science, North Carolina State University
Address: To contact the first author, phone: 919-515-7978, email: Matt [email protected]
Categories and Subject Descriptors: D.2.8 [Software]: Software Engineering—metrics, complexity measures, performance measures; F.2 [Theory of Computation]: Analysis of Algorithms and Problem Complexity; G.2.1 [Mathematics of Computing]: Discrete Mathematics—combinatorial algorithms, permutations and combinations; G.2.2 [Mathematics of Computing]: Graph Theory—graph algorithms, trees; G.3 [Mathematics of Computing]: Probability and Statistics—experimental design; G.4 [Mathematics of Computing]: Mathematical Software—algorithm design and analysis

General Terms: Experimental, Heuristic, NP-Hard, Optimization

Additional Key Words and Phrases: Crossing number, graph drawing, graph embedding, design of experiments, graph equivalence classes, placement, layout

1. INTRODUCTION

The minimization of the crossing number in a specific graph embedding has often been motivated by factors such as (1) improving the appearance of a graph drawing [Di Battista et al. 1999; Eades and Sugiyama 1990; E.R. Gansner, E. Koutsifios, S.C. North and K.P. Vo 1993; Mutzel 1998; Warfield 1977], and (2) reducing the wiring congestion and crosstalk in VLSI circuits, which in turn may reduce the total wire length and the layout area [Leighton 1984; Marek-Sadowska and Sarrafzadeh 1995; Thompson 1979]. This paper is about bigraph crossing, defined as follows for any bipartite graph (bigraph) G = (V0, V1, E) [Harary and Schwenk 1972]: Let G be embedded in the plane so that the nodes in Vi occupy distinct positions on the line y = i and the edges are straight lines. For a specific embedding f(G), the crossing number Cf(G) is the number of line intersections induced by f. This depends only on the permutation of Vi along y = i and not on the specific x-coordinates. The (bigraph) crossing number is C(G) = min_f Cf(G). Garey and Johnson [Garey and Johnson 1983] proved that it is NP-hard to compute C(G).
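For small instances, Cf(G) can be computed directly from the definition. The sketch below is our own illustration (the function name and input format are hypothetical, not from the paper's software): it counts crossings of a two-layer embedding in O(m^2) time by testing every pair of edges.

```python
from itertools import combinations

def embedding_crossings(edges, pos0, pos1):
    """Count crossings of a two-layer straight-line embedding.

    edges: list of (u, v) pairs with u in V0 and v in V1.
    pos0, pos1: dicts giving each node's position along y = 0 and y = 1.
    Edges (u, v) and (u2, v2) cross exactly when u, u2 appear in one
    order on y = 0 and v, v2 appear in the opposite order on y = 1;
    edges sharing an endpoint yield a zero product and are not counted.
    """
    return sum(
        1
        for (u, v), (u2, v2) in combinations(edges, 2)
        if (pos0[u] - pos0[u2]) * (pos1[v] - pos1[v2]) < 0
    )
```

For the large, sparse graphs studied in this paper, the same quantity can be obtained in O(m log m) time by sorting the edges on one layer and counting inversions on the other; the quadratic version above is kept only for clarity.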
Detection of a biplanar graph, a bigraph G for which C(G) = 0, is easy, however [Harary and Schwenk 1971]. The two node sets V0 and V1 are also called layers of the bigraph. Previous work on bigraph crossing has focused primarily on the fixed-layer (version of the) problem, namely computing C(G) subject to the constraint that the permutation of V0 on y = 0 must stay fixed. Even that is NP-hard [Eades and Wormald 1994]. Work on heuristics for both fixed-layer and general bigraph crossing has mostly been theoretical [Di Battista et al. 1999; Eades and Whiteside 1994; Eades and Wormald 1994; Mäkinen 1989; Mäkinen 1990; Shahrokhi et al. 2001; Yamaguchi and Sugimoto 1999]. Experimental evaluation has focused on dense graphs, for which good lower bounds are available [Eades and Kelly 1986; Jünger and Mutzel 1997; Matuszewski et al. 1999], and on relatively small sparse random graphs [Schönfeld et al. 2000; Jünger and Mutzel 1997; Matuszewski et al. 1999]. Graphs arising in circuit design are large (thousands of nodes, possibly millions), very sparse, and highly structured. We have used the dot∗ package [E.R. Gansner, E. Koutsifios, S.C. North and K.P. Vo 1993], intended for graph drawing, to determine placement in circuits [N. Kapur 1998], and found that, after a traditional routing package is applied (e.g., [Betz and Rose 1997; K. Kozminski, (Ed.) 1992]), the wire length and area correlate well with crossing number [Ghosh et al. 1998a; Ghosh et al. 1998b]. In fact, this correlation appears to be independent of the quality of the heuristic used to minimize crossings [Ghosh 2000], so that better crossing minimization implies smaller wire length and area.

∗ The dot heuristic is evaluated only for its ability to minimize crossings in bigraphs. It also does layer assignment and drawing of multi-layer graphs.
This paper expands and consolidates previous experimental work on bigraph crossing by: (a) evaluating heuristics based on their performance on general (rather than fixed-layer) bigraph crossing, (b) presenting new heuristics that perform well on large instances related to circuit design and on random connected graphs of the same sparsity, (c) carefully comparing the new heuristics with dot and other heuristics in the literature, (d) presenting well-defined classes of graph data that illustrate the strengths and weaknesses of various heuristics, and (e) mapping out directions for the development of heuristics and graph data that are most likely to be significant. Our work has potential impact on circuit design and beyond in several ways: (a) new heuristics for bigraph crossing will improve placement, (b) crossing number can be used as a defining characteristic of a graph or a class of graphs that share similar behavior under placement and routing heuristics; "benchmark circuits" can therefore be classes of distinct but similar circuits instead of individual circuits (see [Ghosh 2000; Ghosh and Brglez 1999; Kapur et al. 1997]), and (c) the methodology used to generate subjects and evaluate treatments can be emulated in other domains that call for the solution of NP-hard optimization problems. The performance of a heuristic can vary widely even on a single input graph, depending on the order in which the input is presented. This motivates us to define a presentation (of a graph G) as ⟨G, π0, π1⟩, where πi is a permutation of Vi. As any heuristic implicitly sequences the input when it reads data, the presentation captures essential information about any of the common ways of describing a bigraph. If the bigraph is described as a list of neighbors for each V0 node, π0 gives the order of appearance of the V0 nodes while π1 is used to sort the adjacency lists. A list of edges is sorted using π0 as the primary key and π1 as the secondary key.
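The edge-list case can be made concrete with a small sketch (our own, with hypothetical names): the presentation orders an edge list with π0 as the primary key and π1 as the secondary key.

```python
def order_edges(edges, pi0, pi1):
    """Sequence an edge list according to a presentation <G, pi0, pi1>.

    pi0, pi1: dicts mapping each V0 (resp. V1) node to its rank in the
    corresponding permutation.  pi0 is the primary sort key and pi1 the
    secondary key, matching the edge-list description in the text.
    """
    return sorted(edges, key=lambda e: (pi0[e[0]], pi1[e[1]]))
```

Feeding a heuristic differently permuted copies of the same edge list is exactly how the isomorphism classes of Section 3 are generated.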
Multiple randomly-generated presentations of a single graph G constitute an isomorphism class, an important class of experimental subjects — see Section 3. Quite conveniently, a presentation also yields an embedding of G: use πi to sequence the Vi nodes along y = i. Let C(⟨G, π0, π1⟩) denote its crossing number. The object of the bigraph crossing problem, then, is to compute the crossing number C(G) = min_{π0,π1} C(⟨G, π0, π1⟩). Pushing this idea further, a heuristic h is a mapping from one graph presentation to another, that is, h(⟨G, π0, π1⟩) = ⟨G, π0′, π1′⟩, and we can study its behavior statistically by looking at how a distribution on random ⟨G, π0, π1⟩ drawn from a class of presentations imposes a distribution on C(h(⟨G, π0, π1⟩)). Recent work at CBL† has focused on using distributions of C(h(⟨G′, π0, π1⟩)), where G′ is derived from G (or is identical to it), to characterize G and synthesize graph equivalence classes based on G for the evaluation of circuit partitioning and placement heuristics [Ghosh 2000; Ghosh and Brglez 1999; Ghosh et al. 1998]. This paper focuses on the heuristics themselves and the experimental methodology for comparing their performance. A direct consequence of the work reported here is that, whereas the dot heuristic [E.R. Gansner, E. Koutsifios, S.C. North and K.P. Vo 1993] was used in earlier work characterizing graphs [Ghosh et al. 1998], a heuristic with better statistical properties (and faster execution time) is now used in its place [Ghosh 2000]. The paper is organized as follows: Section 2 presents the heuristics, both existing and new, Section 3 describes the design of our experiments, with special focus on the data sets used, Section 4 summarizes experimental results, and Section 5 presents tentative conclusions and suggests further work.

† Collaborative Benchmarking Laboratory, North Carolina State University
There are four appendices: one discusses validation of our implementations and experimental design with respect to previous work, another proves a fact relevant to a new type of experimental subject, a third gives detailed descriptions of some of the experimental subjects, and the last gives a brief description of software available on the world-wide web.

2. OVERVIEW OF THE HEURISTICS

Our heuristics (experimental treatments) follow the scheme outlined in [Warfield 1977]. There are two phases: an initial ordering and iterative improvement. The former involves some global computation on the graph to sequence the nodes in each layer; the latter repeatedly solves a fixed-layer problem on alternating layers. Various combinations of initial ordering and iterative improvement are considered in our study. In our notation for heuristics and experimental subjects, n0 = |V0|, n1 = |V1|, and m is the number of edges. The expected number of crossings for an arbitrary presentation of a bigraph, as reported in [Warfield 1977], is m(m − 1)(n0 − 1)(n1 − 1)/(4(n0·n1 − 1)).

2.1 Initial Ordering

Two observations suggest that a carefully computed initial ordering can avoid traps for subsequent attempts at improvement without incurring prohibitive execution time. The first, already mentioned in the introduction, is that it is easy to detect a biplanar graph [Harary and Schwenk 1971]. Finding the appropriate embedding is also easy [Eades et al. 1976], even though the best-known heuristics do poorly with biplanar graphs as input. Second, a cycle of length 2n has an optimum embedding with n − 1 crossings [Harary and Schwenk 1972]. That embedding is achieved if the nodes are sequenced in breadth-first order starting at any node of the cycle. We use two initial orderings in our experiments in addition to the random ordering, for a total of 3 possibilities. One ordering uses a breadth-first search starting at a random node.
This always gets the optimal solution for a simple cycle, and usually performs better than random ordering.

2.2 Guided Breadth-First Search

The other initial ordering heuristic is called guided breadth-first search (gbfs). The main breadth-first search is preceded by another search (also breadth-first) whose main function is to identify the longest path P in the graph. The main search then begins at one end of P and, at any branch point, follows all shorter paths before continuing with P. For each node v the first search calculates dist[v], the distance from the start node, and depth[v], the maximum value of dist[w] achieved by any descendant w of v in the breadth-first-search tree. The main search begins at a node s for which dist[s] is maximized. Adjacency lists are sorted by increasing depth, and ties are broken by visiting nodes w with larger dist[w] first. Sequence numbers are assigned to nodes based on the order of visitation in the main search, and these are used to sort the nodes on each layer of the bigraph. All of the above can be accomplished easily in linear time: the depth values can be calculated by visiting nodes in reverse order at the end of the first bfs and letting depth[v] = max over the children w of v of depth[w] (with depth[v] = dist[v] at the leaves). All sorting is based on values that range from 1 to n — bin sorts of all adjacency lists can be combined. Our actual implementation is not linear time, but it runs fast enough in practice. It is easy to show that gbfs always obtains optimal solutions for biplanar graphs and simple cycles.

[Fig. 1. Guided breadth-first search on a small example, showing the depth of each node in the first bfs and its sequence number in the second bfs, with the annotation "watch this dashed edge!"]

Fig. 1 shows the two breadth-first searches and the final ordering of gbfs on a simple example. The first search begins at n0, a completely arbitrary choice. The depth[v] number is shown beside each node in the first search.
When the dashed edge is present, the crossing induced by the cycle is unavoidable, and the order in which c0 and c2 are visited in the second search does not affect the outcome. Without the dashed edge, however, the graph is biplanar, and it is important to visit c2 first in the second search. Otherwise the edge c2n0 will cross c0n2 unnecessarily. The correct order is guaranteed by gbfs because the depth of c2 in the first search will always be less than that of c0.

2.3 Median and Barycenter Heuristics

A typical way to improve an initial ordering is to apply a heuristic for fixed-layer crossing, alternating repeatedly from layer to layer [E.R. Gansner, E. Koutsifios, S.C. North and K.P. Vo 1993; Jünger and Mutzel 1997]. Each application of the fixed-layer heuristic is called a pass. While this is not the only possible paradigm for iterative improvement, it characterizes all three of the heuristics reported here: two popular previously-known heuristics and a new heuristic called adaptive insertion, described in Section 2.4. The median heuristic treats the neighbors of each node as a set of integers representing their ordinal numbers on the opposite layer. Nodes are sorted using the medians of these sets as keys. Implementations of the median heuristic differ in how the median of a set of even cardinality is computed. Ordinarily, the median would be defined as the mean of the two middle elements. The median heuristic introduced in [Eades and Wormald 1994] (see also [Di Battista et al. 1999; Mäkinen 1989]) and evaluated here always uses the smaller of the two candidates, with the added condition that, in case of ties, nodes of odd degree always precede those of even degree. Assuming random initial ordering and a stable sort,‡ the probability that the median heuristic embeds a simple path optimally is O(1/n). It embeds a simple cycle optimally every time.
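One pass of this fixed-layer median heuristic might be sketched as follows. This is our own illustrative code, not the paper's implementation (which uses insertion sort): even-cardinality medians take the smaller middle element, ties place odd-degree nodes first, and Python's built-in sort is stable, as the heuristic requires.

```python
def median_pass(free_layer, adj, pos_fixed):
    """Re-sort one layer by the median positions of its neighbors.

    free_layer: current left-to-right order of the layer being sorted.
    adj: node -> list of its neighbors on the fixed layer.
    pos_fixed: node -> position on the fixed layer.
    """
    def key(v):
        ps = sorted(pos_fixed[u] for u in adj[v])
        # Smaller of the two middle elements for even cardinality.
        median = ps[(len(ps) - 1) // 2] if ps else 0
        # On equal medians, odd-degree nodes precede even-degree nodes.
        return (median, len(ps) % 2 == 0)
    return sorted(free_layer, key=key)
```

The barycenter variant described next differs only in the key: the mean of the neighbor positions replaces the median.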
The barycenter heuristic uses the mean of the set of neighboring positions, rather than the median, as a key for sorting (the name comes from the fact that it is a one-dimensional analog of Tutte's barycenter method for drawing graphs [Tutte 1960; Tutte 1963]). With random initial ordering it performs poorly relative to the median on paths, cycles, and other very sparse, highly-structured graphs (since these require decisive movement of degree-2 nodes). On graphs that are more random and/or have several nodes of high degree, barycenter does better than median (high-degree nodes need to be centrally located with respect to their neighbors). Jünger and Mutzel [Jünger and Mutzel 1997] observed that the barycenter heuristic does better than the median on random graphs of various sizes and densities. We come to the same conclusion. However, as discussed later, our implementation of the barycenter heuristic and our random graph model differ from theirs (see also Appendix A). Our implementations allow median and barycenter to converge, that is, passes are done until no change occurs. Even though we were unable to prove termination, the number of passes never exceeded 4000, even on graphs with more than 15,000 edges. Each pass of the median heuristic can be implemented in linear time (the keys used for sorting are integers in the range 0, ..., n − 1), while the barycenter requires O(m + n log n) per pass (m is the number of edges and n = n0 + n1 the number of nodes). For graphs with a constant degree bound, the barycenter heuristic can also be implemented in linear time per pass. Our experimental implementations of both median and barycenter used insertion sort, which may be the fastest method in practice because the number of inversions decreases significantly during each pass when multiple passes are done.

2.4 Adaptive Insertion Heuristic

Local search is a popular way to improve solutions to optimization problems.
‡ A stable sort does not change the order of items having the same key.

A simple operation on the current presentation is repeated until no instance of the operation improves the cost, i.e., reduces the number of crossings. An example of an operation is neighbor swapping, the exchange of the nodes at positions i and i + 1 on layer ℓ. Such exchanges would be repeated until no choice of i and ℓ would yield a decrease in the number of crossings. The actual difference in the number of crossings, call it d_ℓ(i, i+1), depends only on the relative positions of the neighbors of nodes i and i + 1 on layer 1 − ℓ and is therefore easy to compute [Di Battista et al. 1999, pp. 281–283]. Adaptive insertion generalizes local search based on neighbor swapping in two ways: (a) a larger "neighborhood" — each primitive operation inserts a node at any position among the other nodes on its layer of the bipartition, which greatly increases the chance of finding an improved ordering; and (b) individual operations are batched into passes, as in the Kernighan-Lin graph partitioning heuristic [Kernighan and Lin 1970] — each node is inserted exactly once during a pass. Details of adaptive insertion are based on experimental fine-tuning. For example, we discovered that starting each pass where the previous one ended, rather than at the point of lowest cost in the previous pass, yielded significantly better results. This appears to be because it forces the heuristic to examine a larger set of presentations. The primitive operation of adaptive insertion inserts node i at a different position on its layer ℓ. Suppose i is inserted before j, where j < i. The resulting cost change, D_ℓ(i, j), is Σ_{j ≤ k < i} d_ℓ(i, k) — the effect of the insertion is that of a series of swaps of i with i − 1, ..., j. The situation is symmetric when j > i and i is inserted after node j. One pass of adaptive insertion does a right-to-left sweep of the nodes on a layer ℓ.
If the current node i has not already been inserted, the position j, before or after, with the best D_ℓ(i, j) is found. Nodes are not allowed to stay in place, even if every possible insertion would increase the number of crossings. A node i is marked if (a) i is inserted in a primitive operation or if (b) i is immediately to the left of a node k when k is about to be inserted. Once marked, a node is not inserted (again) during the remainder of the current pass. Part (b) of the marking rule prevents the immediate undoing of a swap between a node and its left neighbor when such a swap does not decrease the number of crossings. To illustrate a pass of adaptive insertion, consider rearranging the V0 nodes of Fig. 2(a) by a sequence of insertions. Recall that the heuristic works from right to left, an arbitrary choice. First, node f finds its best position relative to the other nodes. Ties are broken by choosing the position closest to the current one, except that staying in place is not an option. The best position turns out to be an insertion before node e, with no change in the number of crossings — Fig. 2(b). Both e and f are marked: f because it has been inserted — rule (a) above — and e because of its position immediately to the left of f — rule (b). This makes node d the next one to be inserted into its best position, before node a. This insertion is equivalent to swapping d with each of c, b, and a in turn. Each swap can be evaluated independently [Di Battista et al. 1999]. Swapping d with c decreases the crossing number by 3, because edge cD no longer crosses dA or dB, and edge cB no longer crosses dA. Swaps of d with b and a each yield a decrease of 1, for a total decrease of 5 — Fig. 2(c). Nodes c and d are now marked. The next unmarked node is b, and its best insertion turns out to be to the right,
[Fig. 2. Adaptive insertion on a simple example: (a) initial presentation, 22 crossings; (b) after the first insertion, 22 crossings (+/−0); (c) after the second insertion, 17 crossings (−5); (d) after the third insertion, 18 crossings (+1); (e) after the fourth and final insertion, 19 crossings (+1). Marked nodes and the position of each inserted node are indicated.]

after c, yielding an increase of 1 — see Fig. 2(d). Finally node a is forced to move and finds its best position before d. The position after one pass of adaptive insertion is shown in Fig. 2(e). A single pass of adaptive insertion takes O(m^2) time. If all adjacency lists are initially sorted using layer 1 − ℓ positions as keys (this can be done in linear time), the calculation of d_ℓ(i, k), and hence D_ℓ(i, k), for all k ≠ i takes time O(mk), where k = deg(i). Summing over all nodes i gives the O(m^2) bound.§ For denser graphs the time per node can be reduced to O(m log k) using a simple data structure, for an overall bound of O(mn log(m/n)) per pass. Adaptive insertion has many similarities with sifting, as reported recently in [Matuszewski et al. 1999; Schönfeld et al. 2000]. The two main differences are (a) we do not allow a node to stay in its current position, and (b) we consider nodes in their current right-to-left order rather than based on their degrees. Restriction (a) introduces an element of unpredictability which appears to enhance the potential solution quality produced by adaptive insertion (it does not get stuck in the manner of a local search).

§ Our implementation is slower than the analysis suggests — we use insertion sort to sort adjacency lists and, since we identify nodes by their position on a layer, extra costs are introduced when the position of a node changes.
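The swap cost d_ℓ(i, i+1), on which both the neighbor-swapping local search and adaptive insertion rely, can be computed from neighbor positions alone. A minimal sketch (our own, with hypothetical names):

```python
def swap_delta(u, v, adj, pos_other):
    """Change in crossings if u, with v immediately to its right, swaps with v.

    adj: node -> list of its neighbors on the opposite layer (layer 1 - l).
    pos_other: node -> position on the opposite layer.
    With u left of v, an edge (u, a) crosses an edge (v, b) exactly when
    pos_other[a] > pos_other[b]; after the swap the condition reverses.
    Only edges of u and v are affected, so no other pairs need checking.
    """
    before = sum(1 for a in adj[u] for b in adj[v] if pos_other[a] > pos_other[b])
    after = sum(1 for a in adj[u] for b in adj[v] if pos_other[b] > pos_other[a])
    return after - before  # negative means the swap removes crossings
```

The insertion cost D_ℓ(i, j) is then the sum of such deltas over the intervening nodes, as described in the text.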
But the restriction also takes away any natural stopping criterion, leading to potentially long series of iterations in which no improvement takes place.

2.5 A New Heuristic

Adaptive insertion appears to perform at least as well as the other iterative-improvement heuristics and often considerably better. Unfortunately, the number of passes required to outperform median or barycenter is large and appears to grow nonlinearly with the number of edges (see [Stallmann et al. 1999a]). After much experimentation we engineered a combined iterative-improvement heuristic that repeats the following two-part step until it fails to find a presentation with fewer crossings: (a) run the barycenter heuristic for enough passes to achieve convergence, then (b) run adaptive insertion for a fixed number of passes, alternating layers — we chose 2048 passes. The new heuristic, when combined with gbfs initial ordering, outperforms all the others we considered, but has prohibitive execution times on the larger graphs. We use it primarily as an indicator of the existence of better solutions, and perhaps as a precursor of a more efficient heuristic that performs equally well.

2.6 The Dot Heuristic

The dot heuristic used in our experiments is a modified version of the graph-drawing package described in [E.R. Gansner, E. Koutsifios, S.C. North and K.P. Vo 1993] — the final phase that computes the geometry of the drawing has been eliminated. The revised main program only outputs the number of crossings.¶ Also eliminated is a phase that decomposes the graph into layers — in our experiments, the decomposition into two layers is given. Dot begins with an initial ordering based on depth-first, instead of breadth-first, search. This is followed by iterative improvement based on repeated application of the fixed-layer median heuristic.
The median as implemented in dot uses the mean when there are exactly two elements, and a biased mean — biased toward the side on which neighbors are closer together in the current ordering — for even cardinalities > 2. Each time dot applies the median heuristic to layer ℓ, a local search performs swaps of neighbors on layer ℓ until no swap is able to decrease the cost. Dot alternates application of the median/swapping passes until a fixed number of passes (10 in the implementation we used) has resulted in no cost reduction.

¶ An additional version that outputs the permutation on each layer was developed so that we could verify the results, but it runs significantly slower.

2.7 Experimental Treatments

Our software uses the treatment-number nomenclature of an earlier conference paper [Stallmann et al. 1999b]. Table 1 gives a brief overview of the combinations discussed here. Treatment numbers are non-consecutive — some treatments were omitted because they failed to contribute anything of interest to the results reported here. The output presentation of treatment tr00 is identical to the input presentation — it therefore plays the role of a control in our experiments. We recently added three more treatments to have a basis for comparison with

Treatment   Initial Ordering   Iterative Improvement            Time∗
tr00        random†            none                             O(n)
tr01        random†            median                           O(n)
tr03        random†            barycenter                       O(n lg n)
tr06        bfs                none                             O(n)
tr07        bfs                median                           O(n)
tr09        bfs                barycenter                       O(n lg n)
tr12/dot‡   dfs                median/swap§                     O(n^2)
tr14        gbfs               none                             O(n)
tr15        gbfs               median                           O(n)
tr17        gbfs               barycenter                       O(n lg n)
tr23        gbfs               adaptive-insertion/barycenter¶   O(n^2)
tr24        random             median‖                          O(n^2)
tr25        random             barycenter‖                      O(n^2)
tr30        random             median‖∗∗                        O(n^2)

∗ Asymptotic time of an optimal implementation based on the number of nodes n — we assume that graphs have O(n) edges; time per iteration is given if iterative improvement dominates.
† No modification takes place — the input presentation is assumed to be random.
‡ We refer to this as dot rather than tr12 in what follows.
§ A pass consists of one application of the median heuristic followed by repeated swapping of neighbors on the same layer until there is no further improvement; passes are done on alternating layers until there is no improvement.
¶ Barycenter alternated with 2048 passes of adaptive insertion; the number of passes could have been greatly decreased for all but the largest graphs without any change in the results.
‖ Stops when there is no reduction in crossings.
∗∗ Averages the two middle elements when taking the median of an even-cardinality set.

Table 1. Various heuristics (treatments) used in our experiments. Numbering is based partially on an earlier conference paper and partially on continued experimentation.

the recent experimental work of others [Jünger and Mutzel 1997; Matuszewski et al. 1999]. When the median and barycenter heuristics are used iteratively on alternating layers, experiments by previous authors iterate until there is no decrease in the number of crossings. Our iterative versions of median (tr01) and barycenter (tr03) keep iterating until there is no change in the order of nodes on either side and then use the final presentation as output, even if solutions with fewer crossings were encountered earlier. This has the advantage that there is no need to update the number of crossings after each iteration. The treatments tr24 and tr25 use the same median and barycenter heuristics for each iteration as tr01 and tr03, respectively. However, tr24 and tr25 stop iterating when the solutions produced during the last two passes (one on each layer) have no fewer crossings than the best previous solution.
And, while the sorting algorithm used in tr01 and tr03 is stable (to ensure convergence), tr24 and tr25

Results for random trees
                         Graph Size (m, n0, n1)
               (230,115,116)  (470,235,236)  (950,475,476)  (1910,955,956)
tr25∗
  crossings    [1040,1140]    [3420,3810]    [9880,10700]   [28800,31200]
  time         [0.08,0.25]    [0.25,1.13]    [1.45,3.55]    [6.12,15.3]
  iterations   [8,38]         [13,72]        [35,104]       [55,169]
tr03†
  crossings    [965,1040]     [2970,3150]    [8350,8890]    [22900,24000]
  time         [0.02,0.05]    [0.13,0.22]    [0.53,0.7]     [2.38,3.35]
  iterations   [14,66]        [50,142]       [92,240]       [196,536]
tr17‡
  crossings    [420,474]      [1300,1430]    [3800,4220]    [10900,11800]
  time§        [0.02,0.05]    [0.08,0.15]    [0.37,0.48]    [1.55,1.95]
  iterations   [5,29]         [9,34]         [16,85]        [32,165]
dot
  crossings    [645,743]      [1940,2250]    [6110,6980]    [17800,20700]
  time         —¶             [1,2]          [7,9]          [36,69]
  iterations‖  —              —              —              —
tr23∗∗
  crossings    [231,296]      [708,978]      [2170,2930]    [5730,7390]
  time         [49,50]        [205,207]      [834,3236]     [3434,15909]
  iterations   [3,3]          [3,3]          [3,10]         [3,14]

∗ barycenter, iterated until there is no improvement
† barycenter, iterated until there is no change on either side
‡ barycenter, same as tr03, with a good initial placement
§ does not count time for initial placement
¶ too small to be measured
‖ not able to count iterations for the dot heuristic
∗∗ good initial placement, adaptive insertion, and barycenter; results based on 8 subjects, not 64

Table 2. Number of crossings versus execution time for various iterative-improvement heuristics. Each heuristic was executed on 64 random graphs with the specified number of nodes and edges. A 95% confidence interval about the mean is shown for the number of crossings. The intervals for time and iterations represent ranges from minimum to maximum. Execution time is in seconds on an UltraSparc IIi, 360 MHz, with 256 MB memory. Each iteration of a heuristic other than tr23 is one application of the heuristic to each layer.
An iteration of tr23 is 2048 passes of adaptive insertion followed by iterations of barycenter until there is no change (tr03). use an unstable sort since it gives slightly better results. Finally, tr30 is introduced because, among several attempted implementations of an iterated median heuristic, it comes closest to approximating the results reported for the median heuristic in [Jünger and Mutzel 1997] (see Appendix A for more details). The key feature that distinguishes tr30 from tr24 is that it takes the mean of two middle elements when a node has an even-cardinality adjacency list instead of using the subtler approach of [Eades and Wormald 1994]. Table 2 shows number of crossings and execution times reported for the leading iterative improvement heuristics on random trees of increasing sizes. Execution times for initial placement, even with insertion sort being used during placement, were negligible in comparison with those for iterative improvement. A comparison 12 · M. Stallmann, F. Brglez, and D. Ghosh of tr25 with tr03 shows that our absolute convergence strategy pays off — tr03 reports significantly fewer crossings and runs in less time. It is able to do more iterations in less time because there is no need to update crossings. The results shown are typical and hold for other types of graphs as well. The combined heuristic tr17 obtains solutions with many fewer crossings than tr03 using fewer iterations, the key being an excellent initial placement. This phenomenon is not universal — there are even graphs, for example, hypercubes, on which tr03 outperforms tr17. The dot heuristic achieves less with longer execution times than tr17 on most classes of graphs — it appears to spend a lot of extra effort swapping neighbors for only a small reduction in the number of crossings. On the other hand, tr23 sometimes gets dramatic reductions in the number of crossings and is used to indicate how far from optimal the other solutions are. 
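All of the crossing counts above can be computed without drawing anything: for a fixed order on each layer, the number of crossings equals the number of inversions among edges sorted by their endpoints' positions. A sketch (our own, using the standard O(m log m) merge-sort inversion count, not the paper's implementation):

```python
def count_crossings(order0, order1, edges):
    """Count edge crossings in a two-layer drawing.  Sort edges by
    layer-0 position (ties by layer-1 position), then count inversions
    in the resulting sequence of layer-1 positions."""
    p0 = {v: i for i, v in enumerate(order0)}
    p1 = {v: i for i, v in enumerate(order1)}
    seq = [p1[v] for u, v in
           sorted(edges, key=lambda e: (p0[e[0]], p1[e[1]]))]

    def sort_count(a):
        # merge sort that also returns the number of inversions
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, cl = sort_count(a[:mid])
        right, cr = sort_count(a[mid:])
        merged, i, j, inv = [], 0, 0, cl + cr
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:          # equal positions never cross
                merged.append(left[i]); i += 1
            else:
                inv += len(left) - i         # all of left[i:] exceed right[j]
                merged.append(right[j]); j += 1
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, inv

    return sort_count(seq)[1]
```

Edges sharing an endpoint never cross; the tie-breaking in the sort and the `<=` in the merge take care of that automatically.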
Because of the prohibitive execution times of tr23, we usually ran it on only 8 random subjects from a class of 64. Results are not affected significantly because the variance of tr23 is relatively small. 3. EXPERIMENTAL SUBJECTS A challenging aspect of this work is devising a collection of experimental subjects that leads to well-reasoned conclusions about the quality of the various heuristics and that can also serve as benchmarks for future heuristics. Ultimately we want to explore ways of generating graph equivalence classes that have the same characteristics as specific reference graphs. This, however, is an ongoing effort [Ghosh 2000], beyond the scope of this paper. For now we believe our proposed classes energetically address the issue of differentiating among present and future heuristics. A distinguishing feature of our work is the use of both random and isomorphism classes of graphs instead of isolated benchmarks or generic random graphs to examine the behavior of heuristics. A random class contains graphs (64 in our experiments) generated randomly given n0 , n1 , m, and possibly other information. An isomorphism class consists of random presentations of the same graph G — in some cases G is generated randomly, in others it is designed deterministically. Every isomorphism class for a graph G has a corresponding random class so that we can answer the question: how typical is G of graphs having the same number of nodes and edges? Conversely, every random class has an isomorphism class constructed from one of its members — thus, we can see if the observed behavior of heuristics on the random class is an average that smoothes out widely varying behavior, or whether behavior of any individual instance typifies behavior for the class. The current work strikes a balance between parameterized random graph classes and isomorphism classes based on specific graphs occurring ‘naturally’ in network design or contrived to have specific characteristics. 
We describe the design of these equivalence classes of experimental subjects here. The size of a graph is always measured by its number of edges m. However, a bigraph type, consisting of similar bigraph classes of various sizes, can be characterized by two measures that allow n0 and n1 to be derived from m. The density factor a = m/(n0 + n1 − 1) is the ratio between the size of the graph and that of one of its spanning trees††. We can also specify the balance b = n0/n1. Now let n = ⌊m/a⌋ + 1, n0 = ⌊n/(b + 1)⌋, and n1 = n − n0. The choice of b has little impact on the number of crossings reported by any heuristic except in extreme cases (small graphs and/or extreme values of b).

†† [West 1996, page 362] defines edge density as m/(n0 + n1), essentially the same as our density factor.

name                                       n0                 n1                  m                         a
random graphs, 1 ≤ k ≤ 8
  R(a, 15·2^k − 10),
    a = 1, 33/32, 17/16, 9/8, 5/4, 2, 4, 8  ⌊(⌊m/a⌋+1)/2⌋     ⌊m/a⌋ + 1 − n0      15·2^k − 10               a
  R(1, 15·2^k − 10, d),
    d = 1, 1001/1000, 101/100, 11/10, 2, ∞  ⌊(m+1)/2⌋         m + 1 − n0          15·2^k − 10               1
grid graphs, 1 ≤ k ≤ 10
  Gr,k (even k)                             2^(k−1)            2^(k−1)             2·(2^k − 2^(k/2))         ≈2
  Gr,k (odd k)                              2^(k−1)            2^(k−1)             2^(k+1) − 3·2^((k−1)/2)   ≈2
butterfly graphs, 1 ≤ k ≤ 7
  Gb,k (even k)                             (k/2 + 1)·2^k      (k/2)·2^k           2k·2^k                    ≈2
  Gb,k (odd k)                              ((k+1)/2)·2^k      ((k+1)/2)·2^k       2k·2^k                    ≈2
hypercubes, 1 ≤ k ≤ 9
  Gh,k                                      2^(k−1)            2^(k−1)             k·2^(k−1)                 k/2
VLSI graphs, 2 ≤ k ≤ 7
  G00,k, G01,k                              3·2^k − 2          12·2^k − 7          15·2^k − 10               1
  G10,k, G11,k                              3·2^k − 2          9·2^(k−1) − 1       15·2^k − 10               ≈2
  G20,k, G21,k                              3·2^k − 2          3·2^(k−2) + 4       15·2^k − 10               ≈4

Table 3. Parameters that characterize the proposed graph classes.

Table 3 summarizes our experimental subject classes. Within each bigraph type, the size is controlled by the parameter k, called the rank. With the exception of butterfly graphs and hypercubes, an increment in rank corresponds roughly to a doubling in size. We discuss the table in more detail in Section 3.4 after describing the bigraph types.
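The derivation of layer sizes from m, a, and b can be written directly (a small helper of our own; the function name is illustrative):

```python
from math import floor

def layer_sizes(m, a, b=1.0):
    """Derive the layer sizes from the number of edges m, the density
    factor a = m/(n0 + n1 - 1), and the balance b = n0/n1, via
    n = floor(m/a) + 1, n0 = floor(n/(b + 1)), n1 = n - n0."""
    n = floor(m / a) + 1
    n0 = floor(n / (b + 1))
    return n0, n - n0
```

For example, m = 230 with a = 1 gives (115, 116) and with a = 2 gives (58, 58), matching the sizes used in Tables 2 and 4.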
3.1 Random Graph Classes
Not all random graphs are created equal — with respect to suitability for crossing minimization experiments, that is. This is illustrated plainly in Table 4. Our random graph generator creates connected random graphs by first running a minimum-spanning-tree algorithm with random distances between nodes to create a random spanning tree. Such a tree is later called a dimension-∞ tree. Additional edges, if any are needed, are added by selecting them at random from the edges not used in the spanning tree. The not-necessarily-connected graphs are generated using our own implementation of the random bipartite graph generator from GraphBase [Knuth 1993]. The contrast between the rows marked [1] and [2] speaks for itself. The crossing numbers of random placements reported by tr00 are close to the expected ones of Warfield’s formula [Warfield 1977] and differ little among graphs of the same size. However, the minimum crossing number — and that produced by good heuristics
Size (m, n0, n1)         Heuristic:  tr00               tr07               tr17               tr23
(230,115,116), a = 1
  [1] connected                      [12800,13100]      [749,880]          [420,474]          [232,296]
  [2] not connected                  [12800,13100]      [2540,2860]        [1710,1820]        [1320,1660]
(230,97,97), a = 1.1875
  [3] connected                      [12800,13000]      [2560,2670]        [1870,1950]        [1500,1690]
(230,58,58), a = 2
  [4] connected                      [12600,12900]      [5520,5640]        [4670,4770]        [4360,4750]
  [5] not connected                  [12700,12900]      [5560,5680]        [4760,4840]        [4440,4680]
(470,235,236), a = 1
  [1] connected                      [54300,55200]      [2490,2810]        [1300,1430]        [708,978]
  [2] not connected                  [54200,55000]      [11000,12300]      [6790,7070]        [5590,6590]
(470,198,198), a = 1.1875
  [3] connected                      [54500,55100]      [10600,11000]      [7330,7540]        [5960,6500]
(470,118,118), a = 2
  [4] connected                      [53300,54000]      [23300,23700]      [19300,19500]      [18500,19100]
  [5] not connected                  [53800,54500]      [23500,24000]      [19600,19800]      [18400,19000]
(950,475,476), a = 1
  [1] connected                      [223000,225000]    [7760,8840]        [3800,4220]        [2170,2930]
  [2] not connected                  [223000,226000]    [44200,48900]      [27300,28100]      [22500,24700]
(950,400,401), a = 1.1875
  [3] connected                      [224000,226000]    [43900,44800]      [28700,29200]      [22900,24300]
(950,238,238), a = 2
  [4] connected                      [223000,225000]    [95500,96700]      [78200,78700]      [74500,76100]
  [5] not connected                  [223000,225000]    [97600,98800]      [79600,80300]      [75800,78300]
(1910,955,956), a = 1
  [1] connected                      [908000,915000]    [24800,28400]      [10900,11800]      [5730,7390]
  [2] not connected                  [911000,918000]    [173000,191000]    [107000,110000]    [90600,99400]
(1910,804,805), a = 1.1875
  [3] connected                      [906000,911000]    [179000,182000]    [113000,114000]    [96200,99400]
(1910,478,478), a = 2
  [4] connected                      [906000,912000]    [387000,391000]    [313000,316000]    [300000,305000]
  [5] not connected                  [904000,910000]    [394000,398000]    [322000,323000]    [308000,314000]

Legend
tr00  random order
tr07  breadth-first search + median; best linear-time heuristic
tr17  gbfs + barycenter; best sub-quadratic-time heuristic
tr23  best solution quality; long execution time; results are based on a sample of 8 instead of 64

Table 4.
Number of crossings for connected versus not-necessarily-connected random graphs. The intervals are 95% confidence intervals with respect to the mean number of crossings for 64 random graphs of each category.

— differs widely between trees, row [1], and random graphs with the same number of edges as trees, row [2]. The latter have many isolated nodes, some small components, and usually one larger component that is denser than a tree (see, e.g., [Palmer 1985] for theoretical background). We have included row [3] to show that the behavior of random not-necessarily-connected graphs with density factor a = 1 closely resembles that of connected graphs with a = 1.1875, graphs with the same number of edges but fewer nodes. Rows [4] and [5] repeat the comparison of rows [1] and [2], this time with a = 2, a larger density factor. The difference here is much less pronounced, as can be expected — denser random graphs are more likely to be connected, and their largest connected component is likely to be larger if they are not. The differences in performance among the better heuristics are also less striking. The decision to use only connected random bigraphs in our experiments was clear from the beginning. We felt that (a) our random graphs should behave like our deterministically generated graphs with the same number of nodes and edges, (b) heuristics should not be inadvertently judged on the basis of how well they solve the problem of separating connected components, (c) our measures of graph density should not be skewed by the fact that the largest component in a multicomponent graph is denser than the graph itself, and (d) finding a close-to-optimal embedding of a connected graph is likely to be harder than finding one in a graph with multiple, mostly small, components. Our test data includes random graph classes with a = 1, 1.03125, 1.0625, 1.125, 1.25, 2, 4, 8.
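The connected-graph generator described above, a random spanning tree obtained as a minimum spanning tree under random costs plus randomly chosen extra edges, can be sketched as follows (our own illustration, not the paper's code; processing the bipartite pairs in a uniformly random order with Kruskal's rule is equivalent to an MST under i.i.d. random costs, and enumerating all n0·n1 pairs is acceptable at these sizes):

```python
import random

def random_connected_bigraph(n0, n1, m, seed=None):
    """A connected random bigraph: a random spanning tree first, then
    m - (n0 + n1 - 1) further edges drawn from the unused pairs."""
    rng = random.Random(seed)
    V0 = [('u', i) for i in range(n0)]
    V1 = [('v', j) for j in range(n1)]
    pairs = [(u, v) for u in V0 for v in V1]
    rng.shuffle(pairs)   # random edge order plays the role of random costs
    parent = {x: x for x in V0 + V1}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    tree, rest = [], []
    for u, v in pairs:               # Kruskal's rule on the shuffled pairs
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
        else:
            rest.append((u, v))
    # the spanning tree guarantees connectivity; top up to m edges
    return tree + rest[:m - len(tree)]
```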
Sizes of random graphs range from 20 edges (rank 1) to 3830 edges (rank 8), each rank having twice as many edges as the previous rank, plus 10. The choice of sizes has historical reasons — early experiments were done on specific circuit-based graphs, and we wanted random graphs with the same sizes for comparison.

3.2 Random Trees
When creating a random tree we already know n0 and n1 and hence must assign each node to one of V0 or V1 before creating a random spanning tree. For the dimension-∞ trees used to ensure connectivity, we choose cost(v0, v1), where v0 ∈ V0 and v1 ∈ V1, at random. The bipartition is preserved by letting cost(x, y) = ∞ when x and y are on the same layer. The cost of a potential tree edge can also be based on a distance between points in a metric space of a given dimension. For example, random dimension-2 trees are created by mapping each node to a random point in the unit square and letting cost(v0, v1) be the ∞-norm distance between v0 and v1 (equivalently, we could use a unit circle and the Euclidean norm). These behave quite differently from dimension-∞ trees with respect to the heuristics. Dimension-(1 + α) trees, where 0 ≤ α < 1, use random points in a 1 × α rectangle instead of a unit square. When α = 0 the coordinates are points on a line segment. The set of possible dimension-1 trees turns out to be exactly the set of biplanar trees — see Appendix B for proof of this. As α approaches 0 and tree size stays fixed, the minimum number of crossings found by the better heuristics in dimension-(1 + α) trees also approaches 0. Our tree classes — classes with a = 1 — include random trees with dimension d = 1, 1.001, 1.01, 1.1, 2, and ∞.
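A minimal sketch of the dimension-(1 + α) construction (ours, using a naive O(n^3) Prim's algorithm over cross-layer ∞-norm costs; same-layer pairs are simply never considered, which plays the role of cost = ∞):

```python
import random

def random_tree_points(n0, n1, alpha, seed=None):
    """Random dimension-(1+alpha) tree: every node gets a random point
    in a 1 x alpha rectangle, and the tree is a minimum spanning tree
    over cross-layer pairs under the infinity-norm distance."""
    rng = random.Random(seed)
    pt0 = [(rng.random(), alpha * rng.random()) for _ in range(n0)]
    pt1 = [(rng.random(), alpha * rng.random()) for _ in range(n1)]
    def cost(i, j):
        (x0, y0), (x1, y1) = pt0[i], pt1[j]
        return max(abs(x0 - x1), abs(y0 - y1))
    in0, in1 = {0}, set()            # grow the tree from node u0
    edges = []
    while len(in0) + len(in1) < n0 + n1:
        best = None
        for i in range(n0):
            for j in range(n1):
                # exactly one endpoint already in the tree
                if (i in in0) != (j in in1):
                    c = cost(i, j)
                    if best is None or c < best[0]:
                        best = (c, i, j)
        _, i, j = best
        in0.add(i); in1.add(j)
        edges.append((('u', i), ('v', j)))
    return edges
```

Shrinking `alpha` toward 0 squeezes the points toward a line segment, which is how the minimum number of crossings is driven toward 0 in a controllable fashion.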
Fig. 3. Two VLSI trees (density factor a = 1): (a) the biplanar tree G00,2 (easy, meaning few crossings); (b) the tree G01,2 (hard, meaning many crossings). [Drawings omitted.]

3.3 Isomorphism Classes
For our isomorphism classes we use bigraphs that are well known in the literature and some that were specifically designed to reflect the behaviors we encountered in circuit-design applications. For the latter, we develop our own graph types, mainly to explore issues that were not addressed by the other graph types.

3.3.1 Communication Graphs. Our collection of isomorphism classes includes some well-known bipartite communication graphs — grids, butterflies, and hypercubes (see, e.g., [Leighton 1992]). The grid Gr,k has 2^k nodes, forming a square when k is even and a rectangle with 2:1 aspect ratio when k is odd. In either case, a node at position (i, j) is connected to nodes at (i − 1, j), (i, j − 1), (i + 1, j), and (i, j + 1). One or two of these edges may be missing for nodes on a boundary or in a corner, respectively. Nodes with even i + j are in V0, those with odd i + j in V1. The butterfly graph Gb,k has nodes defined by ordered pairs (i, w), where i ranges from 0 to k and w is a string of k bits. Edges are of the form (i, w)(i + 1, w′), where w′ is either the same as w or differs from w only at the i-th bit (i = 0, . . . , k − 1). Hypercubes, the only graphs in our study whose density factor a increases with size, have 2^k nodes. Each node is denoted by a string of k bits. The set of edges is {xy | x, y differ in exactly one bit}. A 2^k-node hypercube is denoted by Gh,k.

3.3.2 Graphs Motivated by Circuit-Design Applications.
We consider isomorphism classes of graphs specifically designed to test the suitability of various treatments for the circuit layout applications that motivate this work. In circuit applications, V0 typically represents a set of cell nodes, and these often have fixed degree (5 in our examples). The nodes in V1 are net nodes; an alternate abstraction for a net node is a hyper-edge consisting of all the cell nodes to which it is adjacent. The circuit, or VLSI, graphs are designated Gxy,k, where x = 0, 1, or 2 (meaning a = 1, 2, or 4), y = 0 or 1 (meaning a small or large number of crossings), and k ranges from 1 to 7 (the rank; the number of edges is 15·2^k − 10, as in the random graphs). What follows is a brief description of the ideas behind the VLSI graphs. A more detailed account is given in Appendix C.

Fig. 4. Two VLSI graphs with density factor 2: (a) the graph G10,1 (easy, meaning relatively few crossings); (b) the graph G11,1 (hard, meaning many crossings). [Drawings omitted.] In G10,1, the ‘easy’ VLSI graph with a = 2, the 5 neighbors of a2i and a2i+1 are the same, resulting in C(5,2) = 10 crossings among their edges. Furthermore, the nodes a2i+1 and a2i+2 have 2 neighbors in common. Since these neighbors are also common to a2i and a2i+3, an additional 4 crossings result from this overlap. In G11,1, the ‘hard’ VLSI graph with a = 2, the nodes a2i and a2i+1 share only 4 neighbors. The 5th edge is used to connect to a bj that, in turn, connects to a ‘far away’ ak. As in G10,1, each pair shares 2 neighbors with the next pair. If the ‘far away’ connections — via the circled bi nodes — are ignored, there are C(4,2) = 6 crossings for each pair and 4 crossings due to overlap of neighboring pairs.

The G00,k graphs are biplanar — Fig. 3(a) shows G00,2. A “hard” tree is created
by taking a complete binary tree and adding extra leaves to give the cell nodes a degree of 5. Fig. 3(b) shows G01,2 in an embedding that has 28 crossings, the minimum any of our treatments were able to find. The number of edges, and therefore the number of cell nodes, is the same for all VLSI graphs of any given rank. To increase density while keeping rank the same, the number of net nodes is decreased. For example, to achieve a density of 2, a VLSI graph has roughly 3n0/2 net nodes. A density of 4 is achieved with roughly n0/4 net nodes. Easy, or low-crossing, graphs are obtained by simply making each cell node adjacent to the 5 “closest” net nodes, based on an a priori ordering of nodes of each kind. Fig. 4(a) shows G10,1. To increase the number of crossings and confound the heuristics, the hard, or high-crossing, graphs set aside a fraction of the nodes for far-away adjacencies — each net node is adjacent to at least two cell nodes that are far apart in the original ordering. See Fig. 4(b) for a good embedding of G11,1 — the circled net nodes are the ones used for far-away adjacencies.

3.4 Summary
We are now ready to take a more detailed look at Table 3. Random classes are denoted R(a, m, d), where d defaults to ∞ if it is omitted (random spanning trees for graphs with a > 1 are always generated using d = ∞, for example). Each isomorphism class is denoted by the name of the subject (also called the reference graph) used to generate it; for example, G20,5 denotes a class consisting of 64 random presentations of G20,5. We also use two additional pieces of notation: rnd(X), where X is an isomorphism class, is a random class R(a, m), where a and m are the density factor and number of edges of X; and iso(R), where R is a random class, is an isomorphism class generated from one randomly-chosen member of R. Recall that a bigraph type is a collection of classes that differ only in size (number of edges).
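Generating one member of an isomorphism class amounts to presenting the same graph under fresh labels and a fresh input order; a sketch (our own helper, with hypothetical names):

```python
import random

def random_presentation(V0, V1, edges, seed=None):
    """One member of an isomorphism class: relabel each layer by a
    random permutation of its own nodes and shuffle the edge list, so
    a heuristic sees the same graph in a different input presentation."""
    rng = random.Random(seed)
    perm0 = dict(zip(V0, rng.sample(V0, len(V0))))
    perm1 = dict(zip(V1, rng.sample(V1, len(V1))))
    new_edges = [(perm0[u], perm1[v]) for u, v in edges]
    rng.shuffle(new_edges)
    return new_edges
```

Relabeling within each layer preserves the degree sequence of every layer, which is one quick sanity check that the output is isomorphic to the input.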
Table 3 shows 23 basic bigraph types — 7 different a > 1 and 6 different d with a = 1 for random graphs; even (square) and odd (rectangular) grids (counted as two distinct types); butterflies; hypercubes∗; and 6 types of VLSI graphs. For each isomorphism type T, there is a type rnd(T) derived by creating a random graph class with the same m, n0, and n1 as each class in T; this yields 7 new types (there are really 10 isomorphism types, but G00,k has the same parameters as G01,k, etc.). Now, for each type T of the 20 random types discussed so far, there is a type iso(T) derived by selecting a random member of each class in T and creating an isomorphism class from it. The total number of classes is 358 (meaning 22,912 experimental subjects were treated, 64 per class) — 8 sizes for each of 26 classes, the random and their derived isomorphism classes; 10 sizes of grid graphs (times 3 because of the derived random class and its derived isomorphism class); 7 of butterflies (times 3); 9 hypercubes (times 3); and 6 size classes for each of 12 VLSI types (6 original, 3 derived random, and 3 derived isomorphism from the random). For all classes listed in Table 3, the average number of crossings reported by tr00, the random placement, is within 0.2% of the expected crossing number reported in [Warfield 1977]: m(m − 1)(n0 − 1)(n1 − 1)/(4(n0·n1 − 1)). To see the relationship between isomorphism classes and random classes, consider the histograms in Fig. 5. The top row shows the number of crossings obtained by three treatments (tr01, tr09, and dot) on a 64-member isomorphism class of G01,7. In the second row are results for the class rnd(G01,7) — 64 randomly generated trees with dimension d = ∞ and the same n0, n1, and m as G01,7. The third row shows iso(rnd(G01,7)), the isomorphism class of a random member of rnd(G01,7). A weaker treatment, such as tr01, only takes slight advantage of the special structure of G01,7, as seen in the first column of Fig. 5.
The other two columns show treatments that perform equally well on an average run with G01,7. Dot (rightmost column) has a large variance, even on the isomorphism classes. For G01,7 this is an advantage — dot is able to find solutions with fewer crossings than tr09. For random graphs, however, it appears to be a disadvantage — tr09 consistently outperforms dot on these.

Fig. 5. A comparison of isomorphism and random classes. Each of three treatments is applied to 64 subjects of each of 3 classes. The columns represent tr01 (median), tr09 (bfs+barycenter), and dot, respectively. The rows represent G01,7 (64 presentations of the “hard” VLSI tree), rnd(G01,7) (64 random graphs having the same size and density as G01,7), and iso(rnd(G01,7)) (64 presentations of one of the random graphs). [Histograms omitted; the mean and standard deviation of the number of crossings in each panel:]

class               tr01                      tr09                      dot
G01,7               µ = 167380, σ = 7402.3    µ = 9900.97, σ = 176.749  µ = 10165.2, σ = 2036.02
rnd(G01,7)          µ = 178076, σ = 6978.82   µ = 15975.2, σ = 3014.38  µ = 19000.9, σ = 4417.65
iso(rnd(G01,7))     µ = 176808, σ = 6199.56   µ = 14073.2, σ = 1547.85  µ = 19492, σ = 4104.72

4. TREATMENT EVALUATION
Treatments can be evaluated in many ways. Our emphases in this section are (a) to illustrate the performance of treatments relative to each other, (b) to establish which experimental subjects are most useful for distinguishing treatments from each other, and (c) to suggest the most fruitful avenues for development of future heuristics. We begin in Section 4.1 by looking at the effect of the density factor on relative performance. Section 4.2 considers the variations that result from different types of experimental subjects. In Section 4.3 we turn to the asymptotic relative performance of treatments on both random and isomorphism classes. We conclude with other interesting observations in Section 4.4.

∗ Technically, hypercubes are not a bigraph type because different size hypercubes have different density factors.

Fig. 6. A comparison among the various treatments for random graphs with increasing density: (a) a random graph (a = 1, m = 1910); (b) a random graph (a = 1.125, m = 1910); (c) a random graph (a = 2, m = 1910); (d) a random graph (a = 4, m = 1910). The averages are based on 64 random presentations of each graph. Similar results are obtained if 64 random graphs from each class are chosen. [Decomposition diagrams omitted.]

We use decomposition diagrams like those in Fig. 6 to show not only the number of crossings reported by all treatments but also the ingredients of each treatment — see Table 1 on page 10. A solid line from t1 to t2 with t2 to the left of t1 indicates that t2 is t1 followed by iterative improvement, either in the form of the median (one column to the left) or barycenter (two columns). A dashed line from t1 to t2 with t2 to the right of t1 indicates that t2 is t1 preceded by initial placement, either bfs (one column) or gbfs (two columns). The choice of left/right and solid/dashed is arbitrary. Treatments dot and tr23 are shown separately in the center column.
For example, tr07 in the diagram can be reached from tr00 by following either the dashed line to tr06 and then the solid line to tr07, or the solid line to tr01 followed by the dashed line to tr07. These two paths illustrate two different ways to isolate the ingredients of tr07. In either case, we can see how much is contributed by initial placement tr06 (bfs) — the dashed lines to the right — and how much by iterative improvement tr01 (median) — the solid lines to the left. The relative importance of the two ingredients of a hybrid heuristic such as tr07 on a particular graph is immediately evident from the decomposition diagram. The y-axis in the decomposition diagrams is on a logarithmic scale, so that vertical distance between treatments is proportional to the ratio between their reported crossing numbers.

4.1 The Effect of Density
Jünger and Mutzel [Jünger and Mutzel 1997] report that among random graphs only sparse ones exhibit significant differences in the performance of treatments. Fig. 6 corroborates this point — an increase in density factor leads to a significant decrease in the ratios between treatments. When a = 4, even the best of our treatments yields only about a 44% improvement over random placement — 4.9 × 10^5 versus 9 × 10^5 crossings. When a = 8 this decreases to 29% — 6.2 versus 9.0 (not shown). The best improvement we get over tr07, a simple linear-time heuristic, when a = 4 is 9% — 4.9 versus 5.5. When a = 8, it is only 5.5% — 6.2 versus 6.7. Even if such small differences are sought after, the ranking of the treatments does not appear to change beyond a = 2. Furthermore, the 5 best treatments have mutually overlapping 95% confidence intervals with respect to the mean number of crossings for a > 2. Ranking of treatments does, however, change in subtle ways as we progress from trees to denser graphs.
Initial placement is at least as important as iterative improvement for trees but diminishes drastically in importance even for very sparse non-trees. Note the positions of treatments tr06, tr07, tr14, and tr15 relative to tr03 in Fig. 6(a) and (b). Initial placement, yielding a difference as dramatic as iterative improvement on the sparser graphs, is much less important for the denser ones. This can also be seen in the position of tr03 relative to tr09 and tr17 — a t-test here is unable to establish any improvement in performance when better initial placement is added to the barycenter heuristic. The ranking of dot is best for trees but worst for the sparsest non-trees. This phenomenon appears to be solely the result of dot’s initial placement strategy, depth-first rather than breadth-first search. Dot’s iterative improvement performs as expected: better than the median heuristic but not quite as good as barycenter. The relative position of tr23 versus tr17 suggests less about any particular feature of tr23 than it does about the potential for improved heuristics. With increasing density it becomes less likely that any enhancement will yield dramatically lower crossing numbers. The discontinuity in our results, both in terms of actual number of crossings reported and other trends, between trees and extremely sparse non-trees suggests that arboricity (the maximum density factor of any node-induced subgraph) may have an influence on crossing number, something already suggested by the upper bound of [Shahrokhi et al. 2001]. The graphs in Fig. 6 all have the same number of edges. A small decrease in the percentage of spanning-tree edges represents only a small increase in overall density factor, but the density factor of some node-induced subgraphs can increase dramatically.

4.2 Which Subjects are Interesting
We now look at features other than density factor and how they affect our results. Fig.
7 shows decomposition diagrams for 4 graphs, all of which have density factors of approximately 2 and roughly the same number of edges. The ranking of treatments for the butterfly (Fig. 7(a)) is the same as for the random graph with a = 1.125 (not a = 2), but differences in crossings are not as great (yet still greater than for the random graph with a = 2). Square grids (Fig. 7(b)) are very sensitive to initial placement. Applying the median heuristic after a breadth-first search increases rather than decreases the number of crossings! Ironically, dot’s improved position in the rankings is not the result of initial placement — in fact, the number of crossings after initial placement is higher even than that of tr01. Depth-first search is a particularly bad way to sequence the nodes of a square grid. Here it is the weighted median combined with swapping of neighbors that yields a dramatic reduction. Even more dramatic cases where initial placement trumps iterative improvement occur in results for the two VLSI graphs — Fig. 7(c) and (d). The hard VLSI graph G11,7 favors bfs slightly over gbfs. The median applied after either search leads to an increase in the number of crossings. The easy graph G10,7 renders our iterative improvement heuristics almost completely superfluous — neither median nor barycenter can “deduce” the simple linear structure of G10,k on their own, nor can they do much to improve an initial placement that does. Even more extreme differences can be seen among trees. The hard VLSI tree G01,7 seems particularly amenable to good iterative improvement via barycenter — see Fig. 8(a). These decomposition diagrams cover a much larger range on the y-axis, so the differences among heuristics are actually much larger than they might appear relative to those of Figs. 6 and 7. Here, the good relative performance of dot is, again, based on initial placement. Depth-first search is better than breadth-first search for trees. 
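The search-based initial placements compared throughout order each layer by discovery order; a minimal sketch of the breadth-first variant (our own illustration of the idea behind tr06, assuming a connected bigraph and a root in layer 0; dfs and gbfs differ only in the traversal):

```python
from collections import deque

def bfs_placement(adj, root):
    """Order both layers by the sequence in which a breadth-first
    search from `root` discovers the nodes; since BFS depth alternates
    between the two layers of a connected bigraph, discovery order
    within each layer gives its left-to-right order."""
    order0, order1 = [], []
    layer_of = {root: 0}        # root is assumed to lie in layer 0
    seen = {root}
    q = deque([root])
    while q:
        u = q.popleft()
        (order0 if layer_of[u] == 0 else order1).append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                layer_of[v] = 1 - layer_of[u]
                q.append(v)
    return order0, order1
```

Swapping the queue for a stack turns this into the depth-first placement that dot uses, which is exactly the ingredient the comparisons above isolate.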
In fact, for several of the 64 presentations in the isomorphism class, dot does as well as tr23, and its initial placement outperforms all treatments but tr23. For the easy VLSI tree — G00,7, Fig. 8(b) — treatments that use gbfs all report 0 crossings and are not shown. The differences are as large as any we’ve seen, and initial placement determines the quality of solutions — except in the case of dot (a depth-first search on a biplanar tree yields a good solution only if the starting point is at the end of one of the longest paths). Dot has a large variance, reporting solutions with as few as 20 crossings or as many as 1900. Figs. 8(c) and (d) show two random trees with different dimension. The one with d = ∞ is the same as the one in Fig. 6(a) (but the scale is different). The d = 1.1 tree represents a large step in the direction of biplanar trees. The main changes are the increasing advantage of gbfs over bfs and the decreasing effectiveness of (the initial placement phase of) dot.

Fig. 7. A comparison among the various treatments for specific graphs with a density factor of roughly 2: (a) a butterfly graph (a = 2, m = 1784); (b) a square grid (a = 2, m = 1984); (c) a “hard” VLSI graph (a = 2, m = 1910); (d) an “easy” VLSI graph (a = 2, m = 1910). [Decomposition diagrams omitted.]
Fig. 8. A comparison among the various treatments for various trees: (a) a “hard” VLSI tree (a = 1, m = 1910); (b) an “easy” VLSI tree (a = 1, m = 1910); (c) a random tree (a = 1, d = ∞, m = 1910); (d) a random tree (a = 1, d = 1.1, m = 1910). [Decomposition diagrams omitted.]

4.3 Asymptotic Relative Performance
The relative performance of various treatments on specific graphs presented up to now leads us to wonder how these differences behave as graphs grow larger. For the circuit layout applications, where millions of nodes are typical, it is important to know whether a given treatment is likely to produce competitive results with increasing size. Naturally we can only guess these results by extrapolation, but many of the trends are clear-cut and compelling. We turn our attention now to charts that show ratios between the crossing numbers of two given treatments as a function of the number of edges. Each curve on a chart represents a specific graph type, and each point represents a single class. The y value is obtained by taking the ratio of two averages, the average crossing number reported by each treatment for the class. A key difference from earlier charts for random classes is that instead of using the isomorphism class of a single random graph (to be consistent with the isomorphism classes in the same figure), we use the 64 distinct graphs of the original random class (to make the asymptotic curves smoother). As suggested earlier — Section 3.4 — this should not matter much.
The differences in performance illustrated in the previous section are supported by asymptotic trends — large differences in the performance of two treatments on a specific graph from a single class imply a steady increase in the ratio between these two treatments for the corresponding graph type, while small differences remain the same or decrease with increasing size. A typical example is the ratio between tr03 and tr06 (pure barycenter versus pure bfs). Fig. 9(a) supplements the decomposition diagrams of Fig. 6. In all cases, tr03 reports fewer crossings and the ratios become smaller with increasing size. As the density factor increases, the rate of decrease in the ratio slows.† A slowing of the rate of decrease or increase with increasing density factor can be observed in general for various ratios on random graphs and, of course, for the ratio tr00/tr17. This suggests that with larger sizes of random graphs, the differences in height among the decomposition diagrams in Fig. 6 would become more extreme. Ratios for communication graphs, Fig. 9(b), all favor tr03. The upward trend for grids, especially square ones, suggests, however, that larger subjects might favor tr06. Fig. 9(c) shows the ratio tr03/tr06 for various random trees. All those with d < ∞ (i.e. based on geometry) favor tr06, with the rate of increase growing as d approaches 1. The non-geometric trees (d = ∞) favor tr03, as already seen in Fig. 9(a). Fig. 9(d) shows ratios for VLSI graphs. The easy ones (G00,k , G10,k , and G20,k ) all favor tr06, with dramatically increasing ratios. The hard trees exhibit almost the same behavior as random trees (this is true for most of the other asymptotic ratios between treatments), but the hard graphs with density factor 2 (G11,k ) appear to have enough structure to favor tr06. It is not clear what the hard graphs with density factor 4 will do.
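As described at the start of this section, each plotted y value is a ratio of class means (the mean crossing number per treatment over the class), not a mean of per-graph ratios. A minimal sketch of that computation (our naming):

```python
from statistics import mean

def ratio_of_averages(crossings_a, crossings_b):
    """Ratio of mean crossing numbers for two treatments on one class.

    crossings_a and crossings_b hold the crossing numbers reported by
    treatments A and B on the 64 graphs (or presentations) of a class.
    The plotted y value is mean(A) / mean(B), not the mean of ratios.
    """
    return mean(crossings_a) / mean(crossings_b)
```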
The ratio tr03/tr06 can be regarded as a measure of the amount of structure in a graph (and its asymptotic behavior as an indication of the amount of structure in the graph type). Graphs that are more random favor iterative improvement, particularly barycenter, while more structured ones favor initial placement, especially breadth-first search.

† The a = 1.0 curve is for a = 1.0 and d = ∞.

Fig. 9. An asymptotic comparison between pure barycenter and pure bfs, plotted as the ratio of crossing-number averages tr03/tr06: (a) Random graphs. (b) Communication graphs (butterflies, hypercubes, rectangular grids, square grids). (c) Random trees. (d) VLSI graphs.

Fig. 10. How initial placement based on gbfs improves on breadth-first search — asymptotic view: (a) bfs vs. gbfs (tr06/tr14) as a function of size. (b) bfs vs. gbfs as a function of "width".

4.3.1 Asymptotic Behavior of Gbfs. One difference that is not reflected directly in asymptotic behavior is the advantage of gbfs over bfs for random trees with dimension ranging from 2 down to 1. Fig. 10(a) shows the curves of the ratio tr06/tr14 for various tree dimensions. For any fixed dimension, the advantage of gbfs dwindles rapidly as size increases.
The gbfs heuristic on a class of random trees of dimension 1 + α is very sensitive to what we call the width, the number of edges multiplied by α. Roughly speaking, the width of an individual tree represents the average number of edges over all paths that branch away from the longest path. Fig. 10(b) shows the relationship between width and the tr06/tr14 ratio. The same symbols as in Fig. 10(a) are used to show which tree dimension contributed each point along the curve. To get the largest two widths for d = 1.001 we generated additional tree classes with up to 15350 edges. Thus we see that the diminishing effectiveness of gbfs is really due to increasing width, and width increases as we increase size while keeping dimension fixed.

4.3.2 Asymptotic Behavior of Dot. Another interesting comparison is that between dot and tr07. Both place roughly equal emphasis on initial placement versus iterative improvement, but the former uses dfs while the latter uses bfs. The random trees with d = ∞ clearly favor dot — see Figs. 11(a) and (c) — but all other random graphs give the edge to tr07. This advantage is only slight for graphs with a > 1 and shows itself only for the largest graphs. For the geometrically-generated trees it is dramatic, and the gap grows larger as d approaches 1. For example, the tr07/dot ratio goes from about 0.18 to 0.08 when d = 1.1 and the number of edges goes from 1910 to 3830 — thus the inverse dot/tr07 ratio would increase from 5.5 to 12.5. Among the isomorphism graph types only the hard VLSI trees favor dot, even more dramatically so than the random trees — see Fig. 11(d).

Fig. 11. An asymptotic comparison between bfs plus median and dot (dfs plus enhanced median), plotted as the ratio of crossing-number averages tr07/dot: (a) Random graphs. (b) Communication graphs. (c) Random trees. (d) VLSI graphs.

Hypercubes also
show a slight trend in favor of dot as their dimension increases — see Fig. 11(b). Since the density factor also increases, the increase in ratio here is contrary to what one would expect for random graphs with the same a and m. In all other cases, the difference between tr07 and dot is hardly noticeable and the asymptotic trends are far from clear. The lesson here appears to be that depth-first search does reasonably well on trees that are more balanced, while breadth-first search does better when most subtrees at a given node are small. When embedding a biplanar or almost biplanar tree, it is important to embed any small subtrees emanating from a node close to the node itself, while letting long paths stretch from one end to the other. Conversely, with the number of nodes divided more evenly among subtrees (the extreme case of this being the hard VLSI trees G01,k — see Fig. 16, Appendix C), the shuffling of subtrees that occurs with breadth-first search produces many unnecessary crossings.

4.4 Other Observations

At the end of Section 3 we showed that sometimes the large variance of dot could be advantageous, allowing low-crossing presentations to be found in one or more of the 64 runs in an isomorphism class. Fig.
12 illustrates this with G01,7 and iso(rnd(G01,7 )), whose histograms appear in the first and last rows of Fig. 5. The main change is in the ranking (Fig. 12(a) and (b)) or relative position (Fig. 12(c) and (d)) of dot. There is also a slight decrease in the advantage of gbfs over bfs, suggesting that at least 1 out of 64 times bfs finds a presentation almost as good as that of gbfs. For some graphs bfs finds presentations better than those of gbfs.

We now take a look at how the treatments we studied compare with known upper bounds. Fig. 13 illustrates the asymptotic behavior of several treatments and the known upper bounds for four graph types: two hard VLSI graph types, hypercubes, and grids (both rectangular and square). The upper bounds for the VLSI graphs come from Appendix C and the ones for grids and hypercubes come from a recent paper [Shahrokhi et al. 2000]. Only tr23 manages to keep up with the upper bound for the trees. However, the minimum number of crossings reported by dot matches our upper bound in every case (our recursive derivation is based on a depth-first search from the root of the tree). The upper bound for the VLSI graphs with a = 2 is not good at all, and even some of the less powerful treatments manage to do better rather easily. The minimum number of crossings certainly grows at a much slower rate than the quadratic one suggested by our upper bound. In other words, our attempt to create a type of graph with a relatively large crossing number was not successful. The Shahrokhi et al. bound for hypercubes is excellent in relation to our treatments, which is not surprising given that it is based on the recursive structure of the hypercube. Dot is the closest competitor. In fact, for all but the two largest cubes, the minimum crossing number reported by dot matches the upper bound.
Their bound for grids appears to take better advantage of the structure of grids that are not square (square grids are marked with # in front of the number of edges) than most of our treatments. However, tr23 has the same pattern (with respect to square and rectangular grids) and does better still. Since [Shahrokhi et al. 2000] gives a lower bound that is within a constant factor of the upper bound, our results suggest that the better treatments have the same asymptotic behavior as the actual minimum and approximate it reasonably well.

Fig. 12. A comparison between minimum value reported and average value reported for two trees: (a) G01,7 , average. (b) G01,7 , minimum. (c) iso(rnd(G01,7 )), average. (d) iso(rnd(G01,7 )), minimum.

Fig. 13. An asymptotic comparison between known upper bounds and average number of crossings reported by several treatments: (a) Hard trees (G01,k ). (b) Hard graphs with a = 2 (G11,k ). (c) Hypercubes. (d) Grids (meshes).

We learned much more from the behavior of treatments on our carefully-designed classes of subjects than we would have learned examining only the results on random graphs. In a sense, the observed "behavior" of a graph type with respect to our treatments became at least as important as the behavior of the treatments. By identifying some extremes in the range of this latter behavior we hope to have reduced the effort required in judging the efficacy of a new treatment or the usefulness of a new graph type.

5. CONCLUSIONS AND FURTHER WORK

This study is only the beginning of what we hope is a new approach to the experimental study of bigraph crossing and other intractable problems. Input data and treated output can be shared and verified so that different groups working on the same problem can conduct repeatable experiments. Better heuristics are often developed through a detailed understanding of why specific instances present difficulties, and such understanding is made more likely when careful thought is given to the design of types of test subjects. We hope to facilitate future development of graph type design and algorithm engineering by making our experimental infrastructure available on the world-wide web — see Appendix D.

The work we have done is extensive in volume and scope.
Certainly there are many hypotheses beyond those we presented that can be inferred and partially or fully verified from the data already gathered. But there are also important gaps in our data and in the scope of our algorithm engineering, as well as unproven conjectures about the bigraph crossing problem.

The most pressing issue from a practical perspective is generalization to k > 2 layers.∗ There are obvious generalizations for all heuristics discussed here. Initial ordering heuristics depend only on sorting each layer by sequence numbers assigned during a search. Iterative improvement heuristics repeatedly solve fixed-layer problems one layer at a time — their generalization can repeat forward and backward passes through the layers, where a forward pass sequences layer i so as to minimize crossings among edges between it and layer i − 1, and a backward pass minimizes crossings among edges between layers i + 1 and i. The dot heuristic is already generalized in this fashion [E.R. Gansner, E. Koutsifios, S.C. North and K.P. Vo 1993]. Theoretical questions related to the k-layer generalization include (a) is there a simple characterization of a k-planar graph, a k-layer graph that can be embedded on k horizontal lines without edge crossings,† and (b) are there types of graphs, e.g. trees, for which k-layer crossing minimization can be solved in polynomial time?

∗ A k-layer graph is a k-partite graph in which all edges are incident to nodes on adjacent layers; dot converts the former into the latter by adding "virtual nodes" to edges between non-adjacent layers.
† This was recently settled by [Healy et al. 2000].

Future work can be divided into three main categories: theory, algorithm engineering, and experimental design. Some ideas are listed below.

Theory
—Better lower bounds for specific graphs and for general sparse graphs.
—Algorithms with provable approximation bounds for special graph types (as in [Shahrokhi et al.
2000]) or general graphs.
—Faster exact algorithms for trees — the results of [Shahrokhi et al. 2001] imply an O(n^1.6) algorithm [Chung 1984; Shiloach 1979] — or faster approximation algorithms — the minimum number of edges whose removal will eliminate all crossings can be found in linear time for trees [Shahrokhi et al. 2001].
—More general types of graphs, beyond trees, for which the crossing number can be found in polynomial time.‡
—Proof of convergence for the median and barycenter heuristics (or a counterexample) — neither one is necessarily monotone with respect to crossing number.

Algorithm Engineering
—Better initial ordering — for example, a decomposition into biconnected components could be combined with a good heuristic or exact algorithm for trees.
—Better iterative improvement — tr23 suggests that this is possible, but efficiency is an important issue.
—Treatments that, like simulated annealing, use randomization.
—Novel treatments that stray from the standard paradigm of initial ordering followed by iterative improvement, or iterative improvement strategies that are not based on fixed-layer heuristics.

Experimental Design
—Fill in some gaps in the graph types introduced here: (a) trees with dimension d > 2, both integer and non-integer values — what is the maximum value of d before a class is indistinguishable from d = ∞? — and (b) general graphs with d ≠ ∞ — give nodes a random location in a metric space and choose all edges based on increasing distance, first constructing a minimum spanning tree and then adding edges to it.
—Equivalence classes based on specific circuits or different schemes than the ones presented here (see, for example, [Ghosh 2000]).
—Infrastructure to support participation of other researchers who wish to contribute new graph classes, new treatments, or new approaches to analyzing the data (see, for example, [Brglez et al.
2000; Brglez and Lavana 2001]).

ACKNOWLEDGMENTS

Some of the initial crossing number experiments with 2-layer graphs posted on the Web in 1997 were organized and executed by Nevin Kapur. He also contributed the software utilities that support gathering and tabulating data about the experiments and the statistical summaries. Hemang Lavana contributed utilities to dynamically generate web pages of directories that link the data related to the reported experiments. We appreciate their support. Also appreciated are the extensive comments of Cathy McGeoch, our editor, and helpful suggestions by the referees and by Imrich Vrťo.

‡ One simple case is that of bipartite permutation graphs [Spinrad et al. 1987].

APPENDIX

A. COMPARISONS WITH PREVIOUSLY PUBLISHED RESULTS

There are four main differences between the way we conducted our experiments and previous results of [Jünger and Mutzel 1997] or the experiments on the sifting technique [Matuszewski et al. 1999; Schönfeld et al. 2000]. We focus our discussion on the Jünger-Mutzel results since these are used as a benchmark for the others.

(1) Our random graphs are forced to be connected (we generate a random spanning tree before distributing the remaining edges at random). As already pointed out in Section 3 — see Table 4 — graphs generated using the standard random graph model behave, with respect to crossing number, as if they were denser than they really are.

(2) Since there are no published random number seeds for previous experiments, the results cannot be duplicated exactly — for example, as a test of the accuracy of the implementations. In most cases our own results, including our attempted recreation of the relevant experiments in the Jünger-Mutzel paper, were only slightly sensitive to different seeds. The only exception was graphs with 10 nodes on each layer and 10% density [Jünger and Mutzel 1997, Table 4], where most of the statistics appear to be meaningless.
Most heuristics reported 0 crossings more than half the time, so the value of the mean was a matter of chance more than anything else.

(3) Our implementations of both median and barycenter use stable sorts, and we keep iterating until there is no change at all in the positions of nodes. We also implemented versions of both median and barycenter that stop iterating when there is no improvement after the heuristic is applied to both sides (tr24 and tr25 — see Section 2). In these cases there is no need for the sort to be stable — we used an unstable sort because it yielded better results. Table 2 shows a comparison of our implementations with the ones used in other experiments.

(4) It is not clear whether the median heuristic used in the Jünger-Mutzel paper is the one from [Eades and Wormald 1994] or uses the standard definition of the median of an even-cardinality set — the mean of the two middle items. Our attempts to recreate the experiments — see Tables 5 and 6 — suggest the latter (tr30 uses the mean of the two middle items), even though the original paper suggests the former [Jünger and Mutzel 1997; Mutzel 2001]. The exact nature of the sorting algorithm might also have an effect. We used insertion sort in all cases, which meant that our unstable sort reversed the order of nodes with equal keys. Jünger and Mutzel, on the other hand, used priority queues as implemented in LEDA [Mehlhorn and Näher 1999]. These lead to an unstable sort, but not necessarily to the reversal of items with equal keys.

Tables 5 and 6 show our results as compared to those of Tables 4 and 7, respectively, in [Jünger and Mutzel 1997].

∗ Data for this number of edges is extremely sensitive to the random number seed. 100 instances generated with a different seed yielded intervals [0.14,0.39] for tr23, [0.81,1.43] for tr25, and [1.40,2.00] for tr30.
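A single fixed-layer barycenter pass with a stable sort, as discussed in point (3), can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; Python's built-in sort happens to be stable, so nodes with equal barycenters keep their relative order.

```python
def barycenter_pass(free_layer, neighbors, pos_fixed):
    """Reorder free_layer by the mean position of each node's neighbors.

    free_layer: current left-to-right order of the layer being permuted.
    neighbors[v]: v's neighbors on the fixed layer.
    pos_fixed: position of each fixed-layer node on its line.
    The sort is stable, so ties keep their current relative order;
    a node with no neighbors keeps (roughly) its current index as key.
    """
    def bary(v):
        nbrs = neighbors[v]
        if not nbrs:
            return float(free_layer.index(v))
        return sum(pos_fixed[u] for u in nbrs) / len(nbrs)
    return sorted(free_layer, key=bary)
```

Iterating this pass on alternating layers until no node moves corresponds to the stopping rule described above.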
Heuristic    m = 10∗        m = 20        m = 30        m = 40      m = 50
minimum      0.29           11.62         56.60         146.89      276.78
tr23         [0.04,0.16]    [10.6,12.1]   [58.1,61.6]   [145,149]   [277,283]
barycenter   1.52           18.78         65.30         157.70      288.15
tr25         [0.60,1.14]    [14.8,17.2]   [63.5,68.5]   [152,158]   [284,292]
median       1.53           24.08         81.78         189.55      333.25
tr30         [1.26,1.86]    [19.7,22.3]   [71.4,76.6]   [165,174]   [306,318]
tr24         [1.03,1.75]    [19.1,22.3]   [74.1,80.4]   [167,176]   [303,315]

Heuristic    m = 60         m = 70        m = 80        m = 90
minimum      463.17         698.35        1008.38       1405.57
tr23         [456,465]      [699,707]     [1000,1010]   [1410,1420]
barycenter   475.52         710.88        1021.44       1420.68
tr25         [462,475]      [704,715]     [1000,1020]   [1420,1430]
median       539.59         782.33        1110.39       1524.18
tr30         [487,502]      [739,756]     [1050,1070]   [1440,1460]
tr24         [488,504]      [727,744]     [1038,1053]   [1427,1440]

The graphs yielding the above results all have n0 = n1 = 10. 100 instances for each m are generated randomly using standard techniques for random bipartite graphs, like those of, for example, Knuth's GraphBase [Knuth 1993]. Numbers reported for the minimum, median, and barycenter are the means from Table 4 of [Jünger and Mutzel 1997]. Our data for tr23 (the best results obtained), tr24, tr25, and tr30 are 95% confidence intervals of the mean rounded to 3 digits of accuracy.

Table 5. Comparison of crossing number mean values reported by Jünger and Mutzel for random graphs of various densities with our results.

B. PROOF THAT DIMENSION-1 TREES ARE BIPLANAR TREES

Recall that a tree has dimension d = 1 if it is generated in the following way. Assign each node to either V0 or V1. Assign a real number x_i ∈ (0, 1] to each node i. Let the distance between nodes i and j be ∞ if {i, j} ⊆ V_ℓ for ℓ = 0 or 1, and |x_i − x_j| otherwise. Construct a minimum spanning tree using the distance just defined.

Definition 1. A comb is a tree that becomes a path if all the leaves are removed.

Fig. 14 shows a comb; the thicker path in the diagram is the spine, the path that remains when the leaves have been deleted.
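The generation procedure just described can be sketched directly; for the modest sizes involved, a quadratic Kruskal-style MST over all cross-layer pairs with union-find suffices. Names and details here are ours, not from the released generators, and we use the half-open interval [0, 1) produced by random(), which makes no difference to the spanning tree.

```python
import random
from itertools import product

def dimension1_tree(n0, n1, seed=None):
    """Random dimension-1 tree: a bipartite MST of points on a line.

    Nodes 0..n0-1 form V0 and nodes n0..n0+n1-1 form V1; each gets a
    random coordinate on the line.  Only cross-layer edges are allowed
    (same-layer distance is infinite), and the minimum spanning tree is
    built by Kruskal's algorithm with union-find.
    """
    rng = random.Random(seed)
    x = {v: rng.random() for v in range(n0 + n1)}
    cross = sorted((abs(x[u] - x[v]), u, v)
                   for u, v in product(range(n0), range(n0, n0 + n1)))
    parent = list(range(n0 + n1))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v
    tree = []
    for _, u, v in cross:
        ru, rv = find(u), find(v)
        if ru != rv:           # shortest edge joining two components
            parent[ru] = rv
            tree.append((u, v))
    return tree
```

By the lemma below, every tree produced this way is a comb.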
A connected bigraph is biplanar if and only if it is a comb [Harary and Schwenk 1971].

Fig. 14. An example of a comb — the spine is along the thicker line, z1 . . . z7.

Heuristic    m = 20        m = 40        m = 60      m = 80      m = 100
tr23         [11.7,13.3]   [45.4,50.4]   [103,112]   [179,193]   [279,297]
barycenter   15.70         72.50         147.90      273.30      392.30
tr25         [17.1,19.8]   [67.5,75.1]   [152,165]   [259,281]   [400,429]
median       25.70         79.60         188.50      374.20      561.90
tr30         [21.6,24.8]   [85.8,94.2]   [189,203]   [334,359]   [518,552]
tr24         [19.8,23.0]   [94.4,104]    [233,249]   [425,449]   [658,705]

Heuristic    m = 120       m = 140       m = 160       m = 180       m = 200
tr23         [409,436]     [548,581]     [697,736]     [904,953]     [1100,1160]
barycenter   567.00        764.60        1080.70       1272.40       1555.10
tr25         [572,612]     [765,819]     [957,1020]    [1250,1320]   [1520,1600]
median       811.20        1146.20       1481.30       1848.00       2084.10
tr30         [757,802]     [1020,1070]   [1310,1380]   [1690,1770]   [2050,2140]
tr24         [1000,1050]   [1320,1380]   [1750,1840]   [2240,2340]   [2750,2870]

The graphs yielding the above results all have n0 = n1 = m/2. 100 instances for each m are generated randomly using standard techniques for random bipartite graphs. Numbers reported for the median and barycenter are the means from Table 7 of [Jünger and Mutzel 1997]. Our data for tr23 (the best results obtained), tr25, and tr30 are 95% confidence intervals of the mean rounded to 3 digits of accuracy.

Table 6. Comparison of crossing number mean values reported by Jünger and Mutzel for sparse random graphs of various sizes with our results.

Lemma 1. A tree has dimension d = 1 if and only if it is biplanar (a comb).

Proof. (⇐=) Suppose that T is a comb with nodes z1 , . . . , zk along the spine (as in Fig. 14). Assign the real number i/k to zi and numbers in the open interval ((i − 0.5)/k, (i + 0.5)/k) to the non-spine neighbors of zi.
It is easy to see that the minimum spanning tree, restricted to obey the bipartition, will be T.

(=⇒) Suppose T is dimension-1 and place all nodes of T on the x-axis according to the real number assigned to them (assume wlog that no two nodes occupy the same spot). Move the nodes of V1 vertically to y = 1 and draw the edges of T as straight lines. Suppose the edges a1b1 and a2b2, both in T, cross (the a's are in V0) and assume wlog that a1 is the leftmost node (in either V0 or V1) on any edge that crosses — otherwise the proof is symmetric with b2 playing the role of a1. Adding a1b2 to T creates a cycle C. Let a1b0 be the other edge of C incident with a1. Depending on the position of b0 we contradict either the fact that T is a restricted minimum spanning tree or the assumption about a1. See Fig. 15.

b0 is to the left of b2. In order for C to reconnect with b2 there must be another crossing involving an edge with an endpoint to the left of a1, contradicting our assumption about a1.

b0 is to the right of b2. Since b2 is to the right of a1 by assumption, b0 is farther from a1 than b2. This means T ∪ {a1b2} − {a1b0} has less weight than T, contradicting the fact that T is a minimum spanning tree.

b0 is b2, i.e. a1b2 is an edge of T. Add the edge a2b1 to T (it cannot be in T already) to form the cycle a1b1a2b2.

Fig. 15. Cases in the proof that a dimension-1 tree is biplanar: (a) b0 to the left of b2. (b) b0 to the right of b2. (c) b0 is b2.

Depending on the positions of a2 and b1, either T ∪ {a2b1} − {a2b2} or T ∪ {a2b1} − {a1b1} will have less weight than T, contradicting the fact that T is a minimum spanning tree.
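The comb characterization gives a mechanical biplanarity test for trees: delete all leaves and verify that what remains is a path. Since the non-leaf nodes of a tree always induce a subtree, it suffices to check that no remaining node keeps more than two non-leaf neighbors. A sketch (our naming), assuming the tree is given as an adjacency dict:

```python
def is_comb(adj):
    """Return True if the tree given as an adjacency dict is a comb.

    A comb becomes a (possibly empty) path once all leaves are removed;
    by the Harary-Schwenk result these are exactly the biplanar trees.
    The leaf-free remainder of a tree is connected, so it is a path
    exactly when every remaining node has degree at most 2.
    """
    leaves = {v for v, nbrs in adj.items() if len(nbrs) <= 1}
    spine = {v: [u for u in nbrs if u not in leaves]
             for v, nbrs in adj.items() if v not in leaves}
    if not spine:        # a single node, a single edge, or a star
        return True
    return all(len(nbrs) <= 2 for nbrs in spine.values())
```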
The proof suggests that, with an appropriate definition of what it means for all biplanar trees to be equally likely, it may be possible to argue that a random dimension-1 tree with a given n0, n1, and m is equally likely to be any of the biplanar trees with that n0, n1, and m. The structure of the tree depends more on the assignment of nodes to V0 or V1 than on the positions of nodes on the line. Without that a priori assignment, the tree will always be a path.

C. DETAILS ON THE CONSTRUCTION OF VLSI GRAPHS

Our goal: to construct bigraph types (i.e. collections of classes having common characteristics other than size and whose sizes grow exponentially) similar to VLSI circuits, with varying densities and difficulty. Recall that the set V0 represents cell nodes. In many applications these all have the same degree — we chose 5 to allow for density factors up to 4. The set V1 consists of net nodes. Each net node is adjacent to a group of cell nodes that are to be connected in a circuit. The six bigraph types have three density factors (1, 2, and 4) and two "degrees of difficulty" (easy, meaning relatively few crossings, and hard, meaning a larger number of crossings, with the intent of stumping our heuristics).

Easy instances of trees are trivial to create. Just make them biplanar — we connect each cell node to 5 net nodes and share a net node between each two neighboring cell nodes, as illustrated in Fig. 3(a), Section 3.3. The result has no crossings.

To create hard instances of trees, the G01,k graphs, we let n = n0 = 3·2^k − 3, V0 = {a_0, . . . , a_{n−1}}, V1 = {b_0, . . . , b_{4n}}, and

E = {a_i b_{4i+j} | 0 ≤ i ≤ n − 1, 1 ≤ j ≤ 4} ∪ {a_0 b_0} ∪ {a_i b_i | 1 ≤ i ≤ 3} ∪ {a_i b_{2i−3} | 4 ≤ i ≤ n − 1}.

Fig. 16 shows how G01,2 looks when drawn as a binary tree with extra leaves. Fig. 3(b), Section 3.3, shows an embedding with 28 crossings.
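The edge set of G01,k can be generated mechanically from the definition above. The following sketch (our naming, not the released generator) also lets one confirm that every cell node has degree 5 and that the result is a tree on n cell nodes and 4n + 1 net nodes:

```python
def g01_edges(k):
    """Edge list of the hard VLSI tree G01,k as (cell, net) index pairs.

    Follows the definition in the text: n = n0 = 3*2**k - 3 cell nodes
    a_0..a_{n-1}, net nodes b_0..b_{4n}, and four edge families.
    """
    n = 3 * 2 ** k - 3
    edges = [(i, 4 * i + j) for i in range(n) for j in range(1, 5)]
    edges.append((0, 0))                          # {a_0 b_0}
    edges += [(i, i) for i in range(1, 4)]        # {a_i b_i | 1 <= i <= 3}
    edges += [(i, 2 * i - 3) for i in range(4, n)]  # {a_i b_{2i-3}}
    return edges
```

For k = 2 this gives n = 9 cells, 37 distinct net nodes, and 45 edges, which is exactly |V0| + |V1| − 1, as a tree requires.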
Using a straightforward recursive construction, we know the number of crossings for G01,k to be no more than (15/2)k·2^k − 10·2^k + 10, hence O(n log n). This exact bound is clearly not optimal, given the embedding with 28 crossings for k = 2; whether n log n is the right asymptotic bound is unclear.

Fig. 16. The tree G01,2.

Denser graphs are created by decreasing the number of net (V1) nodes. The easier variants generalize the structure of a comb with a sequence of overlapping complete bipartite graphs, called clusters: a k0 × k1 cluster has k_ℓ layer-ℓ nodes and all possible edges between them — the minimum number of crossings for such a cluster is (k0 choose 2)·(k1 choose 2). In G10,k graphs, 2 × 5 clusters (forcing 10 crossings each) overlap at two net nodes on each end; the overlap forms a 4 × 2 cluster that yields 4 additional crossings (1 crossing on each end involves only edges belonging to the same 2 × 5 cluster — otherwise the contribution of a 4 × 2 cluster would be 6 crossings). If c = n0, the number of cell nodes, there are 3c/2 + 2 net nodes. Numbering the node sets a_0, . . . and b_0, . . . as before, the edges are {a_i b_{3i/2+j} | 0 ≤ i ≤ c − 1, 0 ≤ j ≤ 4}. The minimum number of crossings is easily calculated: 7c − 4. Fig. 4(a), Section 3.3, shows the smallest instance.

To increase the density again, we reduce the number of net nodes to (c − 2)/4 + 5. A similar overlap strategy to the one used for G10,k has a sequence of 4 × 5 clusters overlapping at 4 net nodes. Counting the number of crossings is complicated by the fact that a cluster overlaps at 3 net nodes with the second cluster after it and at 2 net nodes with the third one (overlapping at one node induces no new crossings).
In general, the minimum number of crossings is 55c − 260, but k = 2 (c = 10) is a special case because some potential overlaps are missing: 298 is the optimum in that case.

To increase difficulty — and the number of crossings — for each of the two higher densities, only four of the five edges incident on each cell node are used in an overlapping cluster construction. The fifth edge is used to connect to a net node that also connects to another net node "as far away as possible" in the cluster sequence. For G11,k a sequence of 2 × 4 clusters overlap at 2 net nodes, forming 4 × 2 clusters at the overlap. Roughly one third of the net nodes are reserved for the far-away connections. The edges are

{a_i b_{i+j} | 0 ≤ i ≤ c − 1, 0 ≤ j ≤ 2} ∪ {a_i b_{c+2+(i mod c/2)} | 0 ≤ i ≤ c − 1} ∪ {a_i b_{i+3} | i is even} ∪ {a_i b_{i−1} | i is odd}.

See Fig. 4(b), Section 3.3. If the cell nodes are put in numerical sequence and the net nodes sequenced according to a barycenter pass (as shown in the figure), the number of crossings is m^2/20 + 7m/10 − 4.

Finally, the G21,k graphs have overlapping 8 × 4 clusters (3 × 8, then 2 × 8 at the overlap). Half of the roughly c/4 net nodes are used for clusters, the other half for long-distance connections. Let c = n0 as before and let h = (c − 6)/8 + 2 (when k ≥ 2, the number of cell nodes ≡ 6 mod 8). The edges are then defined as

{a_i b_{⌊i/8⌋+j} | 0 ≤ i ≤ c − 1, 0 ≤ j ≤ 3} ∪ {a_i b_{h+2+(⌊i/4⌋ mod h)} | 0 ≤ i ≤ c − 1}.

D. DATA AND PROGRAMS AVAILABLE ON THE WEB

Data and programs used in our experiments are available on the Web at URL www.cbl.ncsu.edu/software/2001-JEA-Software/. A unique feature of our experimental design is the separation of the experimental software into three parts:

(1) The generation of experimental data: Programs and scripts are provided for generating both random and isomorphic graph classes, those reported in the paper and custom-designed ones for which density factor, dimension, and sizes are specified.
(2) The execution of heuristics: A program selectively runs any of the treatments reported in the paper, and scripts run one or more treatments on a set of graph classes specified by Unix wild-card conventions.

(3) The evaluation of the cost function and compilation of results: The separation of execution and evaluation makes it possible for other researchers to run their own heuristics on our data, submit new data to our heuristics, and evaluate any bigraph presentation independently of the heuristic that generated it. Various scripts are available for gathering summary statistics, calculating confidence intervals, and putting comparative results in tabular form.

All software is clearly documented, and a tutorial example illustrates the use of the main scripts.

REFERENCES

Betz, V. and Rose, J. 1997. VPR: A New Packing, Placement and Routing Tool for FPGA Research. In Proceedings of the 7th International Workshop on Field-Programmable Logic (August 1997), pp. 213–222. Software and a PostScript version of the paper can be downloaded from http://www.eecg.toronto.edu/~vaughn/vpr/vpr.html.

Brglez, F. and Lavana, H. 2001. A Universal Client for Distributed Networked Design and Computing. In Proceedings of the 38th Design Automation Conference (June 2001). Also available at http://www.cbl.ncsu.edu/publications/#2001-DAC-Brglez.

Brglez, F., Lavana, H., Ghosh, D., Allen, B., Casstevens, R., Harlow, J. E., III, Kurve, R., Page, S., and Stallmann, M. 2000. OpenExperiment: A Configurable Environment for Collaborative Experimental Design and Execution on the Internet. Technical Report 2000-TR@CBL-02-Brglez (March), CBL, CS Dept., NCSU, Box 8206, Raleigh, NC 27695.

Chung, F. R. K. 1984. On optimal linear arrangement of trees. Comp. and Maths. with Appls. 10, 1, 43–60.

Di Battista, G., Eades, P., Tamassia, R., and Tollis, I. G. 1999. Graph Drawing: Algorithms for the Visualization of Graphs. Prentice Hall.

Eades, P. and Kelly, D. 1986. Heuristics for reducing crossings in 2-layered networks. Ars Combinatoria 21-A, 89–98.

Eades, P., McKay, B. D., and Wormald, N. C. 1976. On an Edge Crossing Problem. Technical report, Department of Computer Science, University of Newcastle, New South Wales 2308, Australia.

Eades, P. and Sugiyama, K. 1990. How to Draw a Directed Graph. Journal of Information Processing 13, 424–437.

Eades, P. and Whitesides, S. 1994. Drawing graphs in two layers. Theoretical Computer Science 131, 361–374.

Eades, P. and Wormald, N. C. 1994. Edge Crossings in Drawings of Bipartite Graphs. Algorithmica 11, 379–403.

Gansner, E. R., Koutsofios, E., North, S. C., and Vo, K. P. 1993. A Technique for Drawing Directed Graphs. IEEE Trans. Software Eng. 19, 214–230. The drawing package dot is available from http://www.research.att.com/sw/tools/graphviz/.

Garey, M. R. and Johnson, D. S. 1983. Crossing Number is NP-complete. SIAM J. Algebraic Discrete Methods 4, 312–316.

Ghosh, D. 2000. Generation of Tightly Controlled Equivalence Classes for Experimental Design of Heuristics for Graph-Based NP-hard Problems. Ph.D. thesis, Electrical and Computer Engineering, North Carolina State University, Raleigh, N.C. Also available at http://www.cbl.ncsu.edu/publications/#2000-Thesis-PhD-Ghosh.

Ghosh, D. and Brglez, F. 1999. Equivalence Classes of Circuit Mutants for Experimental Design. In IEEE 1999 International Symposium on Circuits and Systems – ISCAS'99 (May 1999). A reprint is also accessible from http://www.cbl.ncsu.edu/experiments/DoEArchives/I999-ISCAS.

Ghosh, D., Brglez, F., and Stallmann, M. 1998a. First steps towards experimental design in evaluating layout algorithms: Wire length versus wire crossing in linear placement optimization. Technical Report 1998-TR@CBL-11-Ghosh (October), CBL, CS Dept., NCSU, Box 7550, Raleigh, NC 27695. Also available at http://www.cbl.ncsu.edu/publications/#1998-TR@CBL-11-Ghosh.

Ghosh, D., Brglez, F., and Stallmann, M. 1998b. Hypercrossing Number: A New and Effective Cost Function for Cell Placement Optimization. Technical Report 1998-TR@CBL-12-Ghosh (December), CBL, CS Dept., NCSU, Box 7550, Raleigh, NC 27695. Also available at http://www.cbl.ncsu.edu/publications/#1998-TR@CBL-12-Ghosh.

Ghosh, D., Kapur, N., Harlow, J. E., and Brglez, F. 1998. Synthesis of Wiring Signature-Invariant Equivalence Class Circuit Mutants and Applications to Benchmarking. In Proceedings, Design Automation and Test in Europe (Feb 1998), pp. 656–663. Also available at http://www.cbl.ncsu.edu/publications/#1998-DATE-Ghosh.

Harary, F. and Schwenk, A. J. 1971. Trees with Hamiltonian Square. Mathematika 18, 138–140.

Harary, F. and Schwenk, A. J. 1972. A New Crossing Number for Bipartite Graphs. Utilitas Mathematica 1, 203–209.

Healy, P., Kuusik, A., and Leipert, S. 2000. Characterization of level non-planar graphs by minimal patterns. In COCOON, Number 1858 in LNCS (2000), pp. 74–84.

Jünger, M. and Mutzel, P. 1997. 2-Layer Straightline Crossing Minimization: Performance of Exact and Heuristic Algorithms. Journal of Graph Algorithms and Applications (JGAA) 1, 1, 1–25.

Kapur, N. 1998. Cell Placement and Minimization of Crossing Numbers. Master's thesis, Electrical and Computer Engineering, North Carolina State University, Raleigh, N.C. Also available at http://www.cbl.ncsu.edu/publications/#1998-Thesis-MS-Kapur.

Kapur, N., Ghosh, D., and Brglez, F. 1997. Towards A New Benchmarking Paradigm in EDA: Analysis of Equivalence Class Mutant Circuit Distributions. In ACM International Symposium on Physical Design (April 1997).

Kernighan, B. W. and Lin, S. 1970. An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal 49, 291–307.

Knuth, D. E. 1993. The Stanford GraphBase. Addison-Wesley.

Kozminski, K. (Ed.) 1992. OASIS2.0 User's Guide. MCNC, Research Triangle Park, N.C. 27709. (Over 600 pages, distributed to over 60 teaching and research universities worldwide.)

Leighton, F. T. 1984. New Lower Bound Techniques for VLSI. Math. Systems Theory 17, 47–70.

Leighton, F. T. 1992. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann.

Mäkinen, E. 1989. A Note on the Median Heuristic for Drawing Bipartite Graphs. Fundamenta Informaticae 12, 563–570.

Mäkinen, E. 1990. Experiments on drawing 2-level hierarchical graphs. International Journal of Computer Mathematics 37, 129–135.

Marek-Sadowska, M. and Sarrafzadeh, M. 1995. The Crossing Distribution Problem. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 14, 4 (April), 423–433.

Matuszewski, C., Schönfeld, R., and Molitor, P. 1999. Using sifting for k-layer straightline crossing minimization. In Proc. 7th Graph Drawing Conference, Number 1731 in LNCS (1999), pp. 217–224.

Mehlhorn, K. and Näher, S. 1999. LEDA: A Platform for Combinatorial and Geometric Computing. Cambridge University Press.

Mutzel, P. 1998. The AGD-Library: Algorithms for Graph Drawing. Available from http://www.mpi-sb.mpg.de/~mutzel/dfgdraw/agdlib.html.

Mutzel, P. 2001. Personal communication.

Palmer, E. M. 1985. Graphical Evolution: An Introduction to the Theory of Random Graphs. John Wiley and Sons.

Schönfeld, R., Günther, W., Becker, B., and Molitor, P. 2000. K-layer straightline crossing minimization by speeding up sifting. In Proc. 8th Graph Drawing Conference, Number 1984 in LNCS (2000), pp. 253–258.

Shahrokhi, F., Sýkora, O., Székely, L. A., and Vrťo, I. 2000. A new lower bound for bipartite crossing number with applications. Theoretical Computer Science. To appear.

Shahrokhi, F., Sýkora, O., Székely, L. A., and Vrťo, I. 2001. On bipartite drawing and the linear arrangement problem. SIAM J. Computing. To appear. A preliminary version was published in WADS'97.

Shiloach, Y. 1979. A minimum linear arrangement algorithm for undirected trees. SIAM J. Computing 8, 1, 15–32.

Spinrad, J., Brandstädt, A., and Stewart, L. 1987. Bipartite permutation graphs. Discrete Applied Mathematics 19, 279–292.

Stallmann, M., Brglez, F., and Ghosh, D. 1999a. Evaluating Iterative Improvement Heuristics for Bigraph Crossing Minimization. In IEEE 1999 International Symposium on Circuits and Systems – ISCAS'99 (May 1999). A reprint is also accessible from http://www.cbl.ncsu.edu/publications/#I999-ISCAS-Stallmann.

Stallmann, M., Brglez, F., and Ghosh, D. 1999b. Heuristics and Experimental Design for Bigraph Crossing Number Minimization. In Algorithm Engineering and Experimentation (ALENEX'99), Number 1619 in Lecture Notes in Computer Science (1999), pp. 74–93. Springer Verlag. Also available at http://www.cbl.ncsu.edu/publications/#1999-ALENEX-Stallmann.

Thompson, C. D. 1979. Area-Time complexity for VLSI. In Proceedings, 11th Annual ACM Symposium on Theory of Computing (May 1979), pp. 81–88.

Tutte, W. T. 1960. Convex Representations of Graphs. Proc. London Math. Soc. 10, 304–320.

Tutte, W. T. 1963. How to Draw a Graph. Proc. London Math. Soc. 13, 743–768.

Warfield, J. N. 1977. Crossing Theory and Hierarchy Mapping. IEEE Transactions on Systems, Man, and Cybernetics SMC-7, 7 (July), 505–523.

West, D. B. 1996. Introduction to Graph Theory. Prentice Hall.

Yamaguchi, A. and Sugimoto, A. 1999. An approximation algorithm for the two-layered graph drawing problem. In COCOON, Number 1627 in LNCS (1999), pp. 81–91.