SUBMITTED TO IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE ON 28 AUG 2014

Reconstructing Curvilinear Networks using Path Classifiers and Integer Programming

Engin Türetken, Fethallah Benmansour, Bjoern Andres, Hanspeter Pfister, Senior Member, IEEE, and Pascal Fua, Fellow, IEEE

Abstract—We propose a novel Bayesian approach to automated delineation of curvilinear structures that form complex and potentially loopy networks. By representing the image data as a graph of potential paths, we first show how to weight these paths using discriminatively-trained classifiers that are both robust and generic enough to be applied to very different imaging modalities. We then present an Integer Programming approach to finding the optimal subset of paths, subject to structural and topological constraints that eliminate implausible solutions. Unlike earlier approaches that assume a tree topology for the networks, ours explicitly models the fact that the networks may contain loops, and can reconstruct both cyclic and acyclic ones. We demonstrate the effectiveness of our approach on a variety of challenging datasets, including aerial images of road networks and micrographs of neural arbors, and show that it outperforms state-of-the-art techniques.

Index Terms—curvilinear networks, tubular structures, curvilinear structures, automated reconstruction, integer programming, path classification, minimum arborescence.

1 INTRODUCTION

Networks of curvilinear structures are pervasive in both nature and man-made systems. They appear at all possible scales, ranging from nanometers in Electron Microscopy image stacks of neurons to petameters in dark-matter arbors binding massive galaxy clusters. Modern acquisition systems can produce vast amounts of such imagery in a short period of time.
However, despite the abundance of available data and many years of sustained effort, full automation remains elusive when the image data is noisy and the structures exhibit complex morphology. As a result, practical systems require extensive manual intervention that is both time-consuming and tedious. For example, in the DIADEM challenge to map nerve cells [2], the results of the finalist algorithms, which proved best at tracing axonal and dendritic networks, required substantial time and effort to proofread and correct [3]. Such editing dramatically slows down the analysis process and makes it impossible to exploit the large volumes of imagery being produced for neuroscience and medical research.

• Engin Türetken and Pascal Fua are with the Computer Vision Laboratory, I&C Faculty, EPFL, Lausanne CH-1015, Switzerland. E-mail: [email protected], [email protected]
• Fethallah Benmansour is with Roche Pharma Research and Early Development, Roche Innovation Center, Basel, Switzerland. E-mail: [email protected]
• Bjoern Andres is with the School of Engineering and Applied Sciences, Harvard University, Cambridge, US. E-mail: [email protected]
• Hanspeter Pfister is with the Visual Computing Group, Harvard University, Cambridge, US. E-mail: [email protected]

Part of the problem comes from the fact that existing techniques rely mostly on weak local image evidence and employ greedy heuristics that can easily get trapped in local minima. As a result, they lack robustness to imaging noise and artifacts, which are inherent to microscopy and natural images. Another common issue is that curvilinear networks are usually treated as tree-like structures without any loops. In practice, however, many interesting networks, such as those formed by the roads and blood vessels depicted in Fig. 1, are not trees since they contain cycles. In the latter case, the cycles are created by circulatory anastomoses or by capillaries connecting arteries to veins.
Furthermore, even among those that really are trees, such as the neurites of Fig. 1, the imaging resolution is often so low that branches appear to cross, thus introducing spurious cycles that can only be recognized once the whole structure has been recovered. In fact, this is widely reported as one of the major sources of error [4], [5], [6], [7], [8], [9], and a number of heuristics have been proposed to avoid spurious connections [5], [7], [8].

In this paper, we attempt to overcome these limitations by formulating the reconstruction problem as one of solving an Integer Program (IP) on a graph of potential tubular paths. We further propose a novel approach to weighting these paths using discriminatively-trained path classifiers. More specifically, we first select evenly spaced voxels that are very likely to belong to the curvilinear structures of interest. We treat them as vertices of a graph and connect those that are within a certain distance of each other by geodesic paths [10], whose quality is assessed by an original classification approach. This is in contrast to traditional approaches that score paths by integrating pixel values along their length, which often fail to adequately penalize short-cuts and make it difficult to compute commensurate scores for paths of different lengths. We then look for a subgraph that maximizes a global objective function that combines both image-based and geometry-based terms with respect to which edges of the original graph are active.

Fig. 1. 2D and 3D datasets used to test our approach. (a) Aerial image of roads. (b) Maximum intensity projection (MIP) of a confocal image stack of blood vessels, which appear in red. (c) Minimum intensity projection of a brightfield stack of neurons. (d) MIP of a confocal stack of olfactory projection axons. (e) MIP of a brainbow [1] stack. As with most images in this paper, they are best visualized in color.

Fig. 2. Enforcing tree topology results in the creation of spurious junctions. (a) Image with two crossing branches. The dots depict sample points (seeds) on the centerlines found by maximizing a tubularity measure. In 3D, these branches may be disjoint, but the z-resolution is insufficient to see it and only a single sample is found at their intersection, which we color yellow. (b) The samples are connected by geodesic paths to form a graph. (c,d) The final delineation is obtained by finding a subgraph that minimizes a global cost function. In (c), we prevent samples from being used more than once, resulting in an erroneous delineation. In (d), we allow the yellow point to be used twice and penalize the creation of early terminations and spurious junctions, which yields a better result.

Unlike earlier approaches that aim at building trees, we look for a potentially loopy subgraph while penalizing the formation of spurious junctions and early branch terminations. This is done by letting graph vertices be used by several branches, thus allowing cycles as shown in Fig. 2, but introducing a regularization prior and structural constraints to limit the number of branchings and terminations.

Our approach was first introduced in [11] and [12], both of which produce connected reconstructions. We combine them here and propose new connectivity constraints along with a new optimization strategy to impose them, which results in significant speed-ups. Furthermore, we provide a more thorough validation with new experiments and comparisons to several other methods. We demonstrate the effectiveness of our approach on a wide range of noisy 2D and 3D datasets, including aerial images of road networks and micrographs of the neural and vascular arbors depicted in Fig. 1.
We show that it consistently outperforms state-of-the-art techniques.

2 RELATED WORK

Most delineation techniques and curvilinear structure databases use a sequence of cylindrical compartments as a representation of curvilinear structures [13], [14], [15], [16], [17], [18], [11], [19]. Besides being compact, this geometric representation captures two important properties, connectedness and directedness, which are common to many structures such as blood vessels and neurons and are essential in studying their morphology. With respect to the required level of user input, current approaches can be roughly categorized into three major classes: manual, semi-automated, and automated.

2.1 Manual and Semi-Automated Methods

Manual delineation tools require the cylindrical compartments to be provided sequentially by the user, starting from the structure roots, such as the soma in neural micrographs or the optic disc in fundus scans, and ending at the branch tips [20], [21], [15], [22]. Since they are quite labor-intensive and time-consuming, these tools are most often used to correct the reconstructions obtained by more automated ones.

Semi-automated techniques, on the other hand, use a tubularity measure to reduce user input to a handful of carefully selected seed points to be manually specified. These points are linked with paths that tightly follow high-tubularity pixels in the image. Most methods employ an interactive and sequential procedure, where the user specifies one seed point at a time and the algorithm constructs a high-tubularity path that links the given seed to the current reconstruction [23], [24], [16], [15]. Other techniques include those that prompt the user for a new seed only when the algorithm gets stuck [25] or that require only the root and branch endpoints [26], [17], [22]. In Appendix A, we present a non-exhaustive list of existing software tools, summarizing several key features such as the tubularity measure used and the level of automation.
2.2 Automated Methods

There has recently been a resurgence of interest in automated delineation techniques [27], [28], [29] because extracting curvilinear structures automatically and robustly is of fundamental relevance to many scientific disciplines.

Most automated approaches require a seed point, the root, to be specified for each connected network in the image. These points can be obtained manually, or generated automatically by finding local maxima of the tubularity measure or by using specialized detectors [30], [31]. Starting from the roots, they then grow branches that follow local maxima of the tubularity measure. Depending on the search mechanism employed, existing algorithms can be categorized as either local or global.

Local search algorithms make greedy decisions about which branches to keep in the solution based on local image evidence. They include methods that segment the tubularity image, such as thinning-based methods, which perform skeletonization of this segmentation [32], [13], [33], [34], [35], [36], and active contour-based methods, which are initialized from it [37], [38], [5]. Such methods have been shown to be effective and efficient when a very good segmentation can be reliably obtained. In practice, however, this is hard to do. In particular, thinning-based methods often produce disconnected components and artifacts on noisy data, which then require considerable post-processing and analysis to merge into a meaningful network.

Another important class of local approaches involves explicitly delineating the curvilinear networks. It includes greedy tracking methods that start from a set of seed points and incrementally grow branches by evaluating a tubularity measure [39], [40], [41], [6], [42], [9].
High-tubularity paths are then iteratively added to the solution, and their end points are treated as new seeds from which the process can be restarted. These techniques are computationally efficient because the tubularity measure only needs to be evaluated for a small subset of the image volume in the vicinity of the seeds. However, these approaches typically require separate procedures to detect branching points. Furthermore, due to their greedy nature, they lack robustness to imaging artifacts and noise, especially when there are large gaps along the filaments. As a result, local tracing errors in the growing process may propagate and result in large morphological mistakes.

Global search methods aim to achieve greater robustness by computing the tubularity measure everywhere. Although this is more computationally demanding, it can still be done efficiently in Fourier space or using GPUs [43], [44], [45]. They then optimize a global objective function over a graph of high-tubularity seed points [8] or superpixels [46]. Markov Chain Monte Carlo algorithms, for instance, belong to this class [47], [48]. These methods explore the search space efficiently by first sampling seed points and linking them, and then iteratively adjusting their positions and connections so as to minimize their objective function. However, while they produce smooth tree components in general, the algorithms presented in [47], [48] do not necessarily guarantee spatial connectedness of these components.

By contrast, many other graph-based methods use the specified roots to guarantee connectivity. They sample seed points and connect them by paths that follow local maxima of the tubularity measure. This defines a graph whose vertices are the seeds and whose edges are the paths linking them. The edges of the graph form an overcomplete representation of the underlying curvilinear structures, and the final step is to build a solution by selecting an optimal subset of these candidate edges.
Many existing approaches weight the edges of this graph and pose the problem as a minimum-weight tree problem, which is then solved optimally. Algorithms that find a Minimum Spanning Tree (MST) [49], [14], [7], [18] or a Shortest Path Tree (SPT) [17] belong to this class. Although efficient polynomial-time algorithms exist for both SPT- and MST-based formulations, these approaches suffer from the fact that they must span all the seed points, including some that might be false positives. As a result, they produce spurious branches when seed points that are not part of the tree structure are mistakenly detected, which happens all too often in noisy data. Although suboptimal post-processing procedures can potentially eliminate some of the spurious branches, they cannot easily cope with other topological errors such as gaps.

Our earlier k-Minimum Spanning Tree (k-MST) formulation [8] addressed this issue by posing the problem as one of finding the minimum-cost tree that spans only an a priori unknown subset of k seed points. However, the method relies on a heuristic search algorithm and a dual objective function, one for searching and the other for scoring, without guaranteeing the global optimality of the final reconstruction. Furthermore, it requires an explicit optimization over the auxiliary variable k, which is not relevant to the problem. By contrast, the integer programming formulation we advocate in this paper involves minimizing a single global objective function that allows us to link legitimate seed points while rejecting spurious ones by finding the optimal solution to within a small user-specified tolerance.

Furthermore, all the spanning-tree approaches rely on local tubularity scores to weight the graph edges. For example, global methods that rely on geodesic distances express this cost as an integral of a function of the tubularity values [50], [7].
Similarly, active contour-based methods typically define their energy terms as such integrals over the paths [18], [44]. Since the tubularity values depend only on local image evidence, they are not particularly effective at rejecting paths that lie mostly on the curvilinear structures but partially on the background. Moreover, because the scores are computed as sums of values along the path, normalizing them so that paths of different lengths can be appropriately compared is non-trivial. By contrast, the path classification approach we introduce in Section 5 returns comparable probabilistic costs for paths of arbitrary length. Furthermore, our path features capture global appearance while being robust to noise and inhomogeneities.

Last but not least, none of the existing approaches addresses the delineation problem for loopy structures. For cases where spurious loops seem to be present, for example due to insufficient imaging resolution in 3D stacks or due to projections of the structures onto 2D [51], some of the above-mentioned methods [7], [5] attempt to distinguish spurious crossings from legitimate junctions by penalizing sharp turns and high-tortuosity paths in the solutions and by introducing heuristics to locally and greedily resolve the ambiguities. They do not guarantee global optimality and, as a result, easily get trapped in local minima, as reported in several of these papers. This is what our approach seeks to address by looking for the global optimum of a well-defined objective function.

Fig. 3. Algorithmic steps. (a) Aerial image of a suburban neighborhood. (b) 3-D scale-space tubularity image. (c) Graph obtained by linking the seed points. They are shown in red with the path centerlines overlaid in green. (d) The same graph with probabilities assigned to edges using our path classification approach. Blue and transparent denote low probabilities, red and opaque high ones. Note that only the paths lying on roads appear in red. (e) Reconstruction obtained by our approach.

3 APPROACH

We first briefly outline our reconstruction algorithm, which goes through the following steps, depicted by Fig. 3:

1) We compute a tubularity value at each image location x_i and radius value r_i, which quantifies the likelihood that there exists a tubular structure of radius r_i at location x_i. Given an N-D image, this creates an (N+1)-D scale-space tubularity volume such as the one shown in Fig. 3(b).

2) We select regularly spaced high-tubularity points as seed points and connect pairs of them that are within a given distance of each other. This results in a directed tubular graph, such as the one of Fig. 3(c), which serves as an overcomplete representation of the underlying curvilinear networks.

3) Having trained a path classifier using such graphs and ground-truth trees, we assign probabilistic weights to pairs of consecutive edges of a given graph at detection time, as depicted by Fig. 3(d).

4) We use these weights and solve an integer program to compute the maximum-likelihood directed subgraph of this graph, producing a final result such as the one of Fig. 3(e).

These four steps come in roughly the same sequence as those used in most algorithms that build trees from seed points, such as [49], [7], [18], [8], but with three key differences. First, whereas heuristic optimization algorithms such as MST followed by pruning or the k-MST algorithm of [8] offer no guarantee of optimality, our approach guarantees that the solution is within a small tolerance of the global optimum.
Second, our approach to scoring individual paths using a classifier, instead of integrating pixel values as is usually done, gives us more robustness to image noise and provides peaky probability distributions, such as the one shown in Fig. 3(d), which helps ensure that the global optimum is close to the ground truth. Finally, instead of constraining the subgraph to be a tree as in Fig. 2(c), we allow it to contain cycles, as in Fig. 2(d), and penalize spurious junctions and early branch terminations. In the results section, we show that this yields improved results over very diverse datasets.

4 FORMULATION

In this section, we first discuss the construction of the graph of Fig. 3(c), which is designed to be an over-complete representation of the underlying network of tubular structures. We then present an IP formulation that constrains the reconstruction to be a directed tree and finally extend it to allow cycles in the solution.

4.1 Graph Construction

We build graphs such as the one depicted by Fig. 3(c) in four steps. We first compute the Multi-Directional Oriented Flux tubularity measure introduced in [52]. This measure is used to assess whether a voxel lies on the centerline of a filament at a given scale. We then suppress non-maxima responses around filament centerlines using the NMST algorithm [52], which helps remove some artifacts from further consideration. More specifically, the NMST algorithm suppresses voxels that are not local maxima in the plane perpendicular to a local orientation estimate and within the circular neighborhood defined by their scale. It then computes the MST of the tubularity measure and links pairs of connected components in the non-maxima suppressed volume with the MST paths. Next, we select maximum-tubularity points in the resulting image and treat them as graph vertices, or seeds, to be linked. They are selected greedily, at a minimum distance d from every previously selected point.
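This greedy selection can be sketched in a few lines. The following is a minimal 2D illustration in which the tubularity map is a nested list of scores; the paper operates on the full (N+1)-D scale-space volume, and the helper name is ours, not the authors' implementation:

```python
def select_seeds(tubularity, d):
    """Greedily pick seed points from a 2D tubularity map (nested lists).

    Candidates are visited in decreasing order of tubularity; a point is
    kept only if it lies at least d away (Euclidean) from every
    previously selected seed.  Points with non-positive tubularity are
    never selected.
    """
    candidates = sorted(
        ((tubularity[y][x], (x, y))
         for y in range(len(tubularity))
         for x in range(len(tubularity[0]))),
        reverse=True,
    )
    seeds = []
    for value, (x, y) in candidates:
        if value <= 0:
            break  # remaining candidates are no better
        if all((x - sx) ** 2 + (y - sy) ** 2 >= d * d for sx, sy in seeds):
            seeds.append((x, y))
    return seeds
```

On a toy map with two bright points, both survive when they are at least d apart, and only the brighter one survives when d exceeds their separation.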
Finally, we compute paths linking every pair of seed points within a certain distance l(d) of each other using the minimal path method in the scale-space domain [50]. We take the geodesic tubular path connecting vertices i and j to be

$$ p_{ij} = \arg\min_{\gamma} \int_0^1 P(\gamma(s))\, ds, \qquad (1) $$

where P is the negative exponential mapping of the tubularity measure, s ∈ [0, 1] is the arc-length parameter, and γ is a parametrized curve mapping s to a location in R^{N+1} [10]. As mentioned earlier, the first N dimensions are spatial ones while the last one denotes the scale. The minimization relies on Runge-Kutta gradient descent on the geodesic distance, which is computed using the Fast Marching algorithm [53].

Fig. 4. Considering geometric relationships between edges helps at junctions. (a) A close-up of the graph built by our algorithm at a branching point. (b) Minimizing a sum of individual path costs yields these overlapping paths. (c) Accounting for edge-pair geometry yields the correct connectivity.

4.2 Standard Formulation

Formally, the procedure described above yields a graph G = (V, E), whose vertices V represent the seed points and whose directed edges E = {e_ij | i ∈ V, j ∈ V} represent the geodesic tubular paths linking them. Algorithms [49], [7], [18], [8] that rely on this kind of formalism can all be understood as maximizing an a posteriori probability given a tubularity image and, optionally, a set of meta parameters that encode geometric relations between vertices and edge pairs. For example, in [8], the tree reconstruction problem is addressed by solving

$$ \min_{t \in T(G)} \sum_{e_{ij} \in E} c^a_{ij}\, t_{ij} + \sum_{e_{ij}, e_{jk} \in E} c^g_{ijk}\, t_{ij} t_{jk}, \qquad (2) $$

where T(G) denotes the set of all trees in G and t_ij is a binary variable indicating the presence or absence of e_ij in tree t.
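In the discrete setting, the minimal-path computation of Eq. 1 can be approximated by running Dijkstra's algorithm on the pixel grid, with the negative exponential of the tubularity as the traversal cost. The sketch below is a simplified 2D stand-in for the Runge-Kutta descent on a Fast-Marching distance map used in the paper; the function name and the 4-connectivity are our own choices:

```python
import heapq
import math

def geodesic_cost(tubularity, start, goal):
    """Cheapest start-to-goal cost on a 2D grid, where entering a pixel
    costs exp(-tubularity): paths along high-tubularity ridges are cheap.
    A 4-connected Dijkstra stand-in for the continuous minimization of
    Eq. 1; start and goal are (x, y) tuples."""
    h, w = len(tubularity), len(tubularity[0])
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if (x, y) == goal:
            return d
        if d > best.get((x, y), math.inf):
            continue  # stale queue entry
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h:
                nd = d + math.exp(-tubularity[ny][nx])
                if nd < best.get((nx, ny), math.inf):
                    best[(nx, ny)] = nd
                    heapq.heappush(heap, (nd, (nx, ny)))
    return math.inf
```

On a map whose top row is a bright ridge, the path hugging the ridge costs roughly 2e^{-5}, far less than stepping through the unit-cost background, which is exactly the behavior the negative-exponential mapping is designed to produce.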
c^a_ij represents the cost of an edge, which can be either negative or positive and is computed by integrating pixel-wise negative log-likelihood ratio values along the path connecting the vertices, while c^g_ijk encodes the geometric compatibility of consecutive edges. As shown in Fig. 4, these geometric terms are key to eliminating edge sequences that backtrack or curve unnaturally.

This approach, however, has three severe shortcomings. First, because the c^a_ij and c^g_ijk are computed independently, they are not necessarily commensurate or consistent with each other. As a consequence, careful weighting of the two terms is required for optimal performance. Second, it disallows cycles by preventing vertices from being shared by separate branches, as is required for successful reconstruction in cases such as the one of Fig. 2. Finally, optimizing with a heuristic algorithm [54] does not guarantee a global optimum.

We address the first issue by computing probability estimates, not on single edges, but on edge pairs, so that both appearance and geometry can be accounted for simultaneously. Given such estimates, we solve the remaining two problems by reformulating the reconstruction problem not as the heuristic minimization of an energy such as the one of Eq. 2 but as the solution of an IP that allows loops while penalizing the formation of spurious junctions. In the following, we first introduce our probabilistic model and consider the simpler acyclic case. We then extend our approach to networks that contain loops.

4.3 Probabilistic Model

Let F = {e_ijk = (e_ij, e_jk)} be the set of pairs of consecutive edges in G. By analogy to the binary variable t_ij of Eq. 2, let t_ijk denote the presence or absence of e_ijk in the curvilinear networks and T_ijk be the corresponding hidden variable. Let also t and T be the sets of all t_ijk and T_ijk variables, respectively.
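Enumerating the set F of consecutive edge pairs from the directed edge list is straightforward; a minimal sketch (the helper name is ours):

```python
from collections import defaultdict

def consecutive_edge_pairs(edges):
    """Enumerate F = {e_ijk = (e_ij, e_jk)}: ordered pairs of directed
    edges in which the head of the first edge is the tail of the second.
    Immediate reversals (k == i) are kept here; whether to drop them or
    leave them to be penalized by the geometric term is a design choice
    not fixed by this sketch."""
    outgoing = defaultdict(list)  # tail vertex -> outgoing edges
    for i, j in edges:
        outgoing[i].append((i, j))
    return [((i, j), (j, k)) for (i, j) in edges for (_, k) in outgoing[j]]
```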
Given the graph G and the image evidence I, we look for the optimal delineation t* as the solution of

$$
\begin{aligned}
t^* &= \arg\max_{t \in P_c} P(T = t \mid I, G) \\
    &= \arg\max_{t \in P_c} P(I, G \mid T = t)\, P(T = t) \\
    &= \arg\min_{t \in P_c} \big[ -\log P(I, G \mid T = t) - \log P(T = t) \big], \qquad (3)
\end{aligned}
$$

where we used Bayes' rule and assumed a uniform distribution over the image I and the graph G. The binary vector t belongs to the set P_c of binary vectors that define feasible solutions. In the following, we present two different ways of defining P_c and the two likelihood terms of Eq. 3.

4.4 Tree Reconstruction

In this section, we take P_c to be the set of all directed trees T(G) of G and use a uniform prior, which amounts to assuming that all tree shapes are equally likely. We therefore drop the prior term log P(T = t) from Eq. 3. Furthermore, we assume conditional independence of the image evidence along the edge pairs {e_ijk}, given that we know whether or not they belong to the tree structure. We therefore represent the likelihood term P(I, G | T = t) as a product of individual edge-pair likelihoods. As we show in Appendix B, this leads to

$$
\begin{aligned}
t^* &= \arg\min_{t \in T(G)} \sum_{e_{ijk} \in F} -\log\!\left( \frac{P(T_{ijk} = 1 \mid I_{ijk}, e_{ijk})}{P(T_{ijk} = 0 \mid I_{ijk}, e_{ijk})} \right) t_{ijk} \qquad (4) \\
    &= \arg\min_{t \in T(G)} \sum_{e_{ijk} \in F} c_{ijk}\, t_{ijk}, \qquad (5)
\end{aligned}
$$

where I_ijk represents the image data around the tubular path corresponding to e_ijk. The probability P(T_ijk = 1 | I_ijk, e_ijk) denotes the likelihood of the edge pair e_ijk belonging to the tree structure, which we compute based on the global appearance and geometry of the paths, as described in Section 5. The c_ijk variables represent the probabilistic likelihood ratios we assign to the edge pairs. As discussed in Section 2, we have found this more effective at distinguishing legitimate paths from spurious ones than more standard methods, such as those that integrate a tubularity measure along the path. In the following, we show that optimizing the objective function of Eq. 5 with respect to the constraints t ∈ T(G) amounts to solving a minimum arborescence problem [55] with a quadratic cost.
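On a toy graph, the tree minimization of Eq. 5 can be carried out by exhaustive search, which makes the role of the edge-pair costs c_ijk concrete: pairs whose classifier probability exceeds 0.5 get negative cost and are therefore encouraged. The sketch below is exponential in the number of edges and only illustrates the objective; the IP formulation of the paper is what makes realistic instances tractable. All names and the cost table are hypothetical:

```python
from itertools import combinations

def best_tree(root, edges, pair_cost):
    """Exhaustive minimization of Eq. 5: the sum of c_ijk over consecutive
    edge pairs present in the solution, over all directed trees rooted at
    `root`.  Exponential in len(edges); toy graphs only."""

    def is_tree(sub):
        heads = [j for _, j in sub]
        if len(set(heads)) != len(heads) or root in heads:
            return False  # a vertex with in-degree > 1, or an edge into the root
        reached, changed = {root}, True
        while changed:
            changed = False
            for i, j in sub:
                if i in reached and j not in reached:
                    reached.add(j)
                    changed = True
        return all(j in reached for _, j in sub)  # connected to the root

    best, best_cost = set(), 0.0  # the empty tree has cost 0
    for r in range(1, len(edges) + 1):
        for sub in combinations(edges, r):
            if is_tree(sub):
                s = set(sub)
                c = sum(cost for (e1, e2), cost in pair_cost.items()
                        if e1 in s and e2 in s)  # c_ijk counted iff t_ij * t_jk = 1
                if c < best_cost:
                    best, best_cost = s, c
    return best, best_cost
```

With a hypothetical cost table in which the straight continuations are likely (negative cost) and a junction pair is unlikely (positive cost), the search keeps the chain 0 → 1 → 2 → 3 and drops the spurious branch, exactly the behavior the likelihood-ratio costs are meant to induce.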
By decomposing the indicator variable t_ijk introduced above as the product of the two variables t_ij and t_jk, the minimization of Eq. 5 can be reformulated as the quadratic program

$$ \arg\min_{t \in T(G)} \sum_{e_{ij}, e_{jk} \in E} c_{ijk}\, t_{ij}\, t_{jk}. \qquad (6) $$

However, not all choices of binary values for the indicator variables give rise to a plausible delineation. The minimization must therefore be carried out subject to the constraints t ∈ T(G) to ensure that the solution is a connected tree. We define these constraints by adapting the network flow formulation presented in [55], which provides a compact system with a polynomial number of variables and constraints. Assuming that the root vertex v_r of the optimal tree is given, we express these constraints as

$$ \sum_{e_{rj} \in E} y^l_{rj} \le 1, \qquad \forall\, v_l \in V \setminus \{v_r\}, \qquad (7) $$

Fig. 5. A loopy graph with a root vertex r (in red). Allowing vertex c (in green) to be used by the two different branches (denoted by blue and yellow arrows) produces a loopy solution instead of a tree. However, describing the crossing in terms of edge pairs {i, c, j} and {k, c, l} being active and all other edge pairs containing c, such as {k, c, j}, being inactive makes it possible to eventually recover the tree topology.

4.5 Loopy Reconstruction

As discussed previously, allowing branches to cross produces cyclic graphs such as the one shown in Fig. 5. However, in some cases, such as when delineating the neural structures of the brightfield and brainbow images shown in Fig. 1, we know that the underlying structure
However, describing the crossing inthe the the {v eformulation ,ejk ∈E However, inof some cases, such as when delineating r} ij t2T (G) However, some cases, such as when delineating timal tree is given, we express these constraints as network flow presented in [55], which provides optimal tree is given, we express these constraints as e ,e 2E ij jk truly is a oftree whose we will eventually want to allbranches (denoted by blue and yellow arrows) produces a loopy soAsj} discussed allowing to cross pro X X Xl neural structures of the brightfield and brainbow images terms pairstopology {i, and {k, c,previously, l} and being active and lcompact system with a polynomial number aynot ofedge variables neural structures of c,the brightfield brainbow images XHowever, neural structures ofof the brightfield and brainbow images 1,\choices 2binary V \ {vvalues (7) l yjl V {vV r }, (8)for recover. lution instead a tree. However, describing theone crossing in in Fig. 5 oflr\}, theother indicator r }, 8v l 2all rj l1, 8v In the case of Fig. 5, this means that we need duces cyclic graphs such as the shown edge pairs containing c, such as {k, c, j}, being inactive 8v 2 \ {v (7) wever, noty1, all choices of binary values for the indicator shown in Fig. 1, we know that the underlying structure l yrj 1, 8v 2 V {v }, (7) rj r l and constraints. Assuming that the root shown vertex vinr terms ofin the Fig. 1, we know that the underlying structure vj 2V \{vr } give rise to a plausible delineation. The shown Fig. 1, we know that the underlying structure of edge pairs e and e being active and all other variables above {vl } icj kcl makes it possible to eventually recover the tree topology. to be able to distinguish onewhose branchin from other. One ables rise a plausible delineation. 
The above However, somethecases, such as when delineating the vj 2V \{v r }\{v is a tree topology we will eventually want to vjgive 2V } to optimal tree is given,bewe express these constraints as rX edge pairs containing c, such aswill ekcjeventually , being inactive it minimization must therefore carried outthe subject to the X truly is atruly tree whose topology we will eventually want to makes l X truly is a to tree whose topology weof want to X 8v V \8v {vl r2 },V subject mization must therefore out l l yjl X l 2carried 1,be \ {vr }, to (8) approach would be first recover the subgraph defined by l neural structures the brightfield and brainbow images l recover. In the case of Fig. 5, this means that we need yij y = 0, (9) l to case eventually recover the means treethat topology. (G) to that the {v solution isrecover. a Inpossible ji 1,t 8v yjl y1, 22 VyVlrj\2 },ensure (8)r },(8) l iT 8v V \ r }, recover. the of Fig. 5, this means we need jl 8v \{v {vr1, ,{v vl }, traints t constraints (G) to ∈8v ensure that solution In(7)case the ofinFig. 5,1,this that we underlying need l 2 Vis \ a rthe vj2 2V T \{v } active edges and thenshown to itsknow topology. Fig.assess that the vj 2V \{vi ,vl }l to Reconstruction be able toattempt distinguish onewe branch from the other. One structure connected tree. We define thesebyconstraints bythe adapting the r }\{v vij,v 2V 4.5 Loopy vj 2V } define l } \{vWe nected tree. these constraints adapting the lX vj 2V \{v } to be able to distinguish one branch from the other. One r to be able to distinguish one branch from the other. 
One X However, to consistently enforce constraints on 8v 2 in V provides \ {vr }, formulation presented [55], provides l X X truly is topology we will eventually l X l X Xflow approach would beageometric totree firstwhose recover the subgraph defined by want to work formulation presented in [55], which yij flow tij , lnetwork ,lv\2 vVjlr}, (10) whichAs yl\ji =8v 0, (9) 2 rV i ,{v \}, {v }, l vl l2 V8v l {v ij E, l8eijy2 r\ approach would be to first recover the subgraph defined by discussed previously, allowing branches to cross proapproach would be to first recover the subgraph defined by that we need 4.5 Loopy Reconstruction 8v 2 V {v , v }, a compact system with a polynomial number of variables y 1, 8v 2 V \ {v }, (8) y y = 0, (9) i r l r l y y = 0, (9) jl branches even at junctions, we do both simultaneously by ij recover. In the case of Fig. 5, this means ijwith a ji ji 8v mpact system polynomial number of variables the active edges and then attempt to assess its topology. l 2 V \ {v , v }, vl }, r ∈duces 2V \{v vj 2V \{vi ,vl }i 8vi 2 V r\ {v l r ,(11) i ,vr y2V = tiil,v\{v ,vrj}and 8e 2}i ,v E, edges and then attempt toshown assess itsedges topology. cyclic graphs such as theto one in Fig. 5. from the other. One constraints. Assuming that the root vertex Vthe of active the the active then attempt to assess its topology. il \{v v\{v 2V \{v } vjrAssuming 2V \{v vj 2V } il v 2V ,v lj} i ,v ithat l } l the root vertex vr of thereasoning in terms ofedges whether consecutive pairs of constraints. 
to and bepreviously, able distinguish one branch As discussed allowing branches to cross proHowever, to consistently enforce geometric constraints on lj l optimal tree is given, we express these constraints as y t , 8e 2 E, v 2 V \ {v , vi2 , vV (10) X X However, to consistently enforce geometric constraints onsubgraph ij ij{vr ,constraints r j }, l However, in some cases, such as when delineating the l is given, However, to consistently enforce geometric constraints on ij 2 l 8eij yij tree 0, E, v 2 V \ v }, (12) mal we express these as 8v \ {v }, i l l l r l belong delineation or not. approach would be to first recover the yij ytijij ,X vE, ,rv,j0, tlij , 8eij 2 8eE, 2ij vl \2{v V ry\,jiv{v v},i , vj },(10) (10) to the final duces cyclic graphs such as the one shown in Fig. 5. defined by i= l 2 V ij y branches even at junctions, we do both simultaneously by (9) lil , branches even at junctions, we5,edges do both simultaneously byassess neural structures ofcase the brightfield and brainbow images even at junctions, we do both simultaneously by the 8v \ {v , vbranches =yE, trj 8eil 2 E, i 2 VFor r(11) l }, tX (13) l l 8eyij l 1}, il 2 ij 2 {0, ≤ 1, ∀l ∈ V \ {r}, (7) example, in the of Fig. edge pairs (e , e ) the active and then attempt to its topology However, in some cases, such as when delineating ic cj reasoning in terms of whether consecutive pairs of edges v 2V \{v ,v } v 2V \{v ,v } y= 1, 8v 2 V \ {v }, (7) yil t , 8e 2 E, (11) (11) r E, j r i 2 i l l il il = til , j il 8e rjyil l shown in )Fig. know that the underlying structure reasoning in 1, terms of whether consecutive pairs of edges reasoning inwe terms of but whether consecutive pairs of edges j∈Vy\{r} 0, 8e (12) , e should belong neither (e , e ) nor lr }l However, to consistently enforce geometric constraints on ij 2 E, vl 2 V \ {vr , vi },and (ekc l neural structures of the brightfield and brainbow images cl ic cl ij l 2V \{v belong to the final delineation or not. 
the yijyijare yauxiliary continuous that denote y E, t2ijV 2 V \truly {vbelong vaj }, (10) 0, 2 , ij v{v 0, 8e 8eE, v, l \2{v V8e vi },vl 2(12) (12) r , vis i ,belong r\ i }, l variables ijij2v r , E, ij X treethe whose topology we will want toboth simultaneously by to final delineation orknow not. to the final delineation or eventually not. X lij (e , e ). Similarly, consider the vertices labeled i,j,k,l, branches even at junctions, we do t 2 {0, 1}, 8e 2 E, (13) shown in Fig. 1, we that the underlying structure l kc cj y ≤ 1, ∀l ∈ V \ {r}, (8) ij ij For example, in the case of Fig. 5, edge pairs (e , e ) jl l ic cj w fromtijthe vertex to 2 others. specifically, yjl 8v 2E, V2= \E, {v }, 8e (8) lall t 21}, {0, 1}, 8e 2 root {0, 8eij (13) recover. y tilr,More (11) In case 5,of in this means that we need ij 1, ijil Forthe example, inFig. the case of Fig. 5,we edge pairs ) to of edges example, in the case Fig. 5, edge pairs (eof e ic ), ecj il 2 E, ic ,(e and(13) m For in the graph ofa, of Fig. 2(b). In the delineation j∈V \{l} l terms of whether consecutive truly tree topology will eventually and (eiskc ereasoning should belong but neither (ecj nor cl ) whose ic , ewant cl ) pairs cates unique directed path from the root 2V \{vlwhether } wherethe l the y are auxiliary continuous variables that denote to be able to distinguish one branch from the other. One and (e , e ) should belong but neither (e , e ) nor and (e , e ) should belong but neither (e , e ) nor l ij l X X kc cl ic cl kc cl ic cl y 0, 8e 2 E, v 2 V \ {v , v }, (12) ij r i l ij continuous Fig. 2(d), edge pairs (e , e ) and (e , e ) are both active where the y are auxiliary variables that denote belong to the final delineation or not. here the y are auxiliary continuous variables that denote recover. In the case of Fig. 5, this means that we need ∀l ∈ V \ {r}, ij jk mj jl l l (e , e ). Similarly, consider the vertices labeled i,j,k,l, X X ij ij kc cj ertex vllthe traverses the edge eij . 
jiIfV=the optimal 8v 2 \to {vall yl ijthe − (9) 0, flow from root vertex More specifically, r },others.tree l y would beSimilarly, toboth first recover thevertices subgraph defined by5,i,j,k,l, , etocj ).(13) consider the vertices labeled (ekc ,(e ejcj ). Similarly, consider the labeled i,j,k,l, kc ∀i2More ∈E, V \specifically, {r, yji = 0, (9)l}, tij 2all {0, 1},others. 8e andapproach vertex belongs branches. theycontain flow the root vertex toothers. For example, in the case of Fig. edge One pairs ij l from from the root vertex to More able to distinguish one branch the other. andbe mto in the graph of Fig. 2(b). Infrom the delineation of (eic , ecj ) 8vthe Vall \ {v , ij vldoes }, specifically, se flow not and hence such a path not i 2 unique rdirected j∈Vv\{i,r} j∈V \{i,l} l y indicates whether path from the root the active edges and then attempt to assess its topology. and m in the graph of Fig. 2(b). In the delineation of (e l,vr } ij and m in the graph of Fig. 2(b). In the delineation of \{v v 2V \{v ,v } i j i l In the following, we first describe our cost function that l y indicates whether the unique directed path from the root and (e , e ) should belong but neither approach would be to first recover the subgraph defined by indicates whether the unique directed path from the root l kc cl ic , ecl ) no Fig. 2(d), edge pairs (e , e ) and (e , e ) are both active l ij ij jk mj jl j hen yl ij v= 0. The first two ensure that the constraints yij are auxiliary variables that denote the el ij∈continuous . VIf tree yij ≤where tvijl, traverses ∀e ∈ E, \ the {r, i,optimal j}, (10) r to vertex However, to2(d), consistently geometric constraints onvertices ij edge Fig. edge ,) eand )junctions, and (e eare arethe both active Fig. 2(d), edge pairspairs (e ,(e ecj (e , emj ),and active ij jk jl )both ijenforce jk mj jl penalizes the formation of spurious then ⇤edge v to vertex v traverses the edge e . If the optimal tree (e , e ). 
Similarly, consider labeled i,j,k,l y t , 8e 2 E, v 2 V \ {v , v , v }, (10) the active edges and then attempt to assess its topology. to vertex v traverses the e . If the optimal tree ij ij r i j l r l ij kc and vertex j belongs to both branches. ij ⇤ l ij an be⇤ att most one path in tfrom the root to to each the flow root vertex all others. More specifically, l not does vfrom and hence such a path does l the branches even atbelongs junctions, we do both simultaneously by and vertex j belongs to inboth branches. andnot vertex jIn toconsistently both branches. yil = lv tilcontain , and ∀esuch E, (11) il ∈ such l does introduce the constraints that allow graph vertices to be t not contain v and hence a path does not and m the graph of Fig. 2(b). In the delineation o However, to enforce geometric constraints on does not contain hence a path does not l the following, we first describe our cost function that l l y = t , 8e 2 E, (11) n theil graph. third one conservation ofdirected il The yil ij indicates whether the unique pathIn from the root exist, then = 0.enforces The first two constraints ensure that reasoning terms of whether consecutive pairs of edges Ininseparate the following, we first describe our cost function that ll ≥yij0, the following, we first describe our cost function that l ∀e E, l ∈ V \ {r, i}, (12) shared among branches. exist, then y = 0. The first two ensure that Fig. 2(d), edge pairs (e , e ) and (e , e ) are both active branches even at junctions, we do both simultaneously by ij ∈constraints l ij ist, then y = 0. The first two constraints ensure that ij jk mj jl penalizes the formation of spurious junctions, and then ⇤ ij intermediate vertices vlE, .vertex The vijr at edge Iftopenalizes the optimal tree yij there 0, ij can 8e 2to vl one 2remaining Vvl⇤path \traverses {vr⇤,in viconstraints }, (12)eroot ij . belong be most t the from the each to thereasoning final penalizes thedelineation formation ofjnot. 
spurious and the formation of or spurious and pairs thenthen ⇤canatbe onenot path in from thehence root to each and vertex belongs tojunctions, both branches. in terms of whether consecutive of edges tijat∈tmost 1}, ∀e E, (13) ere can be most one path inthe tthird from root to(13) each ij t∈ to introduce the constraints thatjunctions, allow graph vertices to be ee there that includes aij⇤{0, path root vertex contain vthe and such a path does not lthe the graph. The one enforces conservation of tij 2tvertex {0, 1},in 8e 2does E,from For example, in the case of Fig. 5, edge pairs (e , e ) be introduce the constraints that allow graph vertices to be ic cj introduce the constraints that allow graph vertices to 4.5.1 Cost Function ⇤ third vertex in the graph. The one enforces conservation of belong to the final delineation or not. In the following, we first describe our cost function tha l rtex in the graph. The third one enforces conservation of shared among separate branches. l ng through edge eexist, t then contains yij =econtinuous two constraints ensure that separate ilyif are il flow at intermediate vertices v0.l. . The The first remaining constraints the auxiliary variables that denote andshared (eshared ecl )among should belong but neither (eicof , eedge ) nor branches. l where ij continuous kc ,among cl separate branches. ere the yij are auxiliary variables that denote at intermediate vertices v . The remaining constraints For example, in the case of Fig. 5, pairs e and penalizes the formation spurious junctions, and then ⇤ e offlow images contain several disconnected curvil ow atour intermediate vertices v . The remaining constraints icj We seek to minimize the sum of the two negative log⇤ l vertex there can be at amost one path in t tofrom the to each consider the vertices labeled i,j,k,l, guarantee that tthe includes path from the root the vertex theToflow from root to all others. More specifically, (ekc , ecjroot ). Similarly, ⇤ ⇤ avoid structures. 
having to process them seflow fromthat the root to all others. More specifically, guarantee that tvertex includes a path from the root to the vertex e should be active in the final solution but neither e introduce the constraints that allow graph vertices to be kcl icl arantee t includes a path from the root to the vertex 4.5.1 Cost Function likelihood terms of Eq 3. In this section, we take the ⇤ l vertex inedge thethe graph. third one conservation of Function through eilunique tThedirected contains eilenforces . from whether path the root lypassing and m in Cost the graph of Fig.among 2(b). In thethe delineation of 4.5.1 Cost ⇤ ifpath ij indicates lly a vgreedy manner, which may result in some 4.5.1 Function ⇤ if ndicates whether the unique directed from the root vin through edge e t contains e . nor e . Similarly, consider vertices labeled i,j,k,l, shared separate branches. l passing il il kcj passing through edge e if t contains e . first one as an image data and path geometry likelihood ∗ il . Ifdisconnected of our images contain several curviatil intermediate vertices vthe remaining Wepairs seek(eto the of the two negative logl . The rSome to vertex l traverses the edge edisconnected optimal tree t constraints being “stolen” byflow the first structure we Fig. 2(d), edge ,minimize ethe ) and ,sum ejl )2(b). are both active ij mj Some of images our images contain curviWe seek to minimize the (e sum of the two negative log- of of contain several disconnected curviosSome vertex vour the edge eijt⇤.several Ifhaving thereconstruct optimal tree and mone inijas the graph ofof Fig. In the delineation We seek tovertex minimize sum the two negative logl traverses linear structures. To avoid to process them seterm, and the second ajkjunction prior that penalizes guarantee that includes a path from the root to the likelihood terms of Eq 3. In this section, we take the refore a suboptimal solution, we connect them all. 
does not contain l and hence such a path does not exist, linear structures. To avoid having to process them seand vertex j belongs to both branches. near structures. To avoid having to process them selikelihood terms of Eq 3. In this section, we take the and quentially a greedy manner, may result inlikelihood some does notare contain and hence such a which path does not Fig. 2(d), edge eFunction and epath aretake boththe active 4.5.1 Cost terms of Eq 3.pairs In this l set ijk section, mjlwe bifurcations terminations. la v ng we given ain R of first root vertices, for v= through edge eone t⇤ensure contains . following, first one asor an image data and geometry likelihood quentially in greedy manner, which may result inunwarranted some l passing il if then yij 0. The two constraints thatIneil there the we first describe our cost function that entially a greedy manner, which may result in some lin branches being “ stolen” by the first structure we reconstruct first one as an image data and path geometry likelihood t, then yijcurvilinear = 0. The first by twothe ensure that As vertex j belongs toand both branches. we show in Appendix assuming conditional indefirst one term, as anand image data path likelihood nderlying network ofconstraints interest, we create Some our images contain several disconnected curviWeC, toone minimize the and sum ofthat thepenalizes two negative log branches being “stolen” first we reconstruct the second as ageometry junction prior anches being “stolen” bysuboptimal theofpath first wethe reconstruct be at most one in t∗structure root topendence each vertex and therefore apath we connect them all. penalizes theand formation ofseek spurious junctions, ⇤ structure ofand the image evidence given the true values ofthen term, the second one as afirst junction prior that penalizes vertex vcan and connect itstructures. 
to teach vsolution, 2from Rroot by zero edl can be at one in from to each linear avoid having to process them seIn the following, we describe our cost function that v most rTothe term, the second one as a junction prior that penalizes and therefore a suboptimal solution, we connect them all. likelihood terms of Eq 3. In this section, we take the a suboptimal we them all. unwarranted or terminations. Assuming we are given aone set Rconnect of root vertices, one for variables in the graph. Thesolution, third enforces of result flow introduce the constraints that allow graph to be and then the may random Tbifurcations , bifurcations the first term can bevertices rewritten ge therefore pairs containing all other vertices to which vconservation ijkthe quentially in aof greedy manner, which inpenalizes some unwarranted or terminations. r is Assuming we are given a set R of root vertices, one for ex in theeach graph. The third one enforces conservation of formation of spurious junctions, unwarranted bifurcations or terminations. ssuming we are given a set R root vertices, one for As we show in Appendix C, assuming conditional indefirst one as an image data and path geometry likelihood underlying curvilinear of interest, we create at intermediate vertices i“stolen” ∈network V .ofThe remaining constraints ascreate ed. To reconstruct multiple disconnected structures shared branches. 
branches by the first structure weamong reconstruct Aspendence weseparate show in Appendix C, assuming conditional indeunderlying curvilinear interest, we As we show in Appendix C, assuming conditional indeateach intermediate vertices vlnetwork .being Thenetwork remaining constraints introduce the constraints that allow graph vertices to be ch underlying curvilinear of interest, we create of the image evidence given the true values of a virtual vertex v and connect it to each v R by zero ∗ term, and the second one as a junction prior that penalizes v r X wea therefore replace vand the constraints of Eqs. and atosuboptimal them all. guarantee tinincludes aiteach path from the27root towe theconnect vertex rtherefore pendence of image the variables image evidence given the true values of virtual vthat connect to each vsolution, Rto by zero ⇤vertex v connect r pendence of the evidence given the true values of rewritten the random T , the first term can be virtual vertex v and it v 2 R by zero cost edge pairs containing all other vertices which v is antee that t includes a path from the root to the vertex log(P (I, G|T = t)) = c w t , (15) shared among separate branches. ijk v r ijk ijk ijk r ∗ unwarranted bifurcations or terminations. v and edge add the following constraints thecontains problem Assuming we are given a which setto R root vertices, one forvariables the random Tijk the first be rewritten pairs containing all eother which v4.5.1 eilvof .r is l passing through ifto tvertices r is il Cost Function the random variables Tijke,ijk the, first termterm can can be rewritten ⇤other as stvcost edgethrough pairs containing alltedge vertices connected. Toeilreconstruct multiple disconnected assing edge ifimages contains eilseveral . to As we 2F show in Appendix C, assuming conditional inde each underlying curvilinear network ofstructures interest, we create as connected. 
To reconstruct multiple disconnected structures Some of our contain disconnected curviX as t = 1, 8e 2 E : v 2 R. (14) nnected. To reconstruct multiple disconnected structures at weavr therefore replace vrand in the constraints ofWe Eqs. 7- Rtoby vronce, r omeat of our images contain several minimize theFunction sum of X thet)) two negative logpendence of the image evidence given the (15) true values o X virtual vertex vdisconnected connect itoftoEqs. each zero Cost v the rc 2 once, we therefore replace vthe constraints 7-vseek avoid having toofcurviprocess them se- 4.5.1 log(P (I, G|T = cijk wis ijk tijk , r in where ratio ofvariables Eq. 5= and w we therefore replace vTo in constraints Eqs. 712linear with vstructures. and add the following constraints to the problem ijk is the likelihood ijk rpairs v cost aronce, structures. To avoid having to process them selog(P (I, G|T = t)) = c w t , (15) the random T , the first term can be rewritten ijk ijk ijk edge containing all other vertices to which v is likelihood terms of Eq 3. In this section, we take the ijk rseek 12 with vv add and add the following constraints the problem log(P (I,function G|T = t)) = thepairs csum w tijke,two (15) quentially infollowing a greedy manner, result in some ensures that all the the root vertices will which be intomay the 2F ijk ee ijk ijk a learned compatibility of edge and We to minimize of the with vin constraints to the problem ij jk . negative logv and ntially a greedy manner, which may result in some eijk 2F asdata connected. To reconstruct multiple disconnected structures t = 1, 8e 2 E : v 2 R. (14) branches being “stolen” by the first structure we reconstruct vr vr r first one as an image and path geometry likelihood e 2F ijkEq n.ches being “stolen” We (14) model the second likelihood term of a Bayesian likelihood terms oflikelihood Eq. 3. 3Inas this tvr =8e 8e 2: Ev :2 vreplace R. vr in the the first structure we r 2 reconstruct once, we of Eqs. 
7- cijk where isasthe ratio ofsection, Eq. X 5 we and take wijkthe is 1,by 2 vr Etherefore R. (14)constraints and ttherefore a1,vr suboptimal we connect them and all. vr = at r solution, term, thefirst second one a junction prior that penalizes where c is the likelihood ratio of Eq. 5 and w isijke tijk log(P (I, G|T = t)) = c w (15 thereforewhich a suboptimal solution, we connect them all. ijk ijk one as an image data and path geometry likelihood ensures that all the root vertices will be in the 12 with v and add the following constraints to the problem a learned compatibility function of edge pairs e and . , where c is the likelihood ratio of Eq. 5 and w is v Assuming we are given a set V ⊂ V of root vertices, ij jk ijk ijk r which ensures that allRroot the root vertices will bethe in unwarranted the a learned bifurcations or terminations. compatibility function ofa edge pairs eeijk and ejk . 2F uming we are given a set of root vertices, one for ij which ensures that all the vertices will be in term, and the second one as junction prior that penalizes solution. We model the second likelihood term of Eq 3 as a Bayesian a learned compatibility function of edge pairs e and e . one for each underlyingtvrcurvilinear structure ij jk = 1, 8evr E : vrof2 interest, R. (14) As we Appendix C, assuming conditional indesolution. Weshow model the second likelihood term a Bayesian underlying network ofvinterest, we2create lution. We theinsecond likelihood term of Eqof3Eq as 3a as Bayesian we curvilinear create a virtual vertex and connect it to each r ∈model Vr of unwarranted bifurcations or terminations. where c is the likelihood ratio of Eq. 
5 and wijk is ijk pendence the image evidence given the true values of rtual vertex connect itpairs to each v 2 R all by other zero vertices to v andcost by vzero edgeensures containing assuming conditional inde-eij and ejk which that rall the root vertices will be variables inAsthewe show learned compatibility function of edge pairs the random Ta ijk , in theAppendix first termC,can be rewritten edge pairs containing all other vertices to which v is r which r issolution. connected. To reconstruct multiple disconnected pendence We of the image evidence given theterm trueofvalues ofa Bayesian model the second likelihood Eq 3 as as nected. Tostructures reconstruct multiple disconnected structures at once, we therefore replace r in the constraints the random variablesX Tijk , the first term can be rewritten nce, we therefore vr in of Eqs. 7-constraints to as of Eqs. replace 7-12 with v the andconstraints add the following log(P (I, G|T = t)) = cijk wijk tijk , (15) with vv andthe add the following constraints to the problem X problem eijk 2F tvr = 1, 8evr 2 E : vr 2 R. (14) tvr = 1, ∀evr ∈ E : r ∈ Vr . − log(P (I, G|T = t)) = cijk wijk tijk , (15) where(14) cijk is the likelihood ratio of Eq. eijk 5 ∈F and wijk is ich ensures that all the root vertices will be in the a learned compatibility function of edge pairs eij and ejk . which ensures that all the root vertices will be in the is the likelihood Eq. 5 and wijk is a tion. We model thewhere secondcijk likelihood term of Eqratio 3 as of a Bayesian learned compatibility function of edge pairs eij and ejk . We solution. SUBMITTED TO IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE ON 28 AUG 2014 model the second likelihood term of Eq. P 3 as a Bayesian network with latent variables Mij = emij ∈F Tmij and P Oij = T , which denote the true number of ijn eijn ∈F incoming and outgoing edge pairs into or out of edge eij . 
Furthermore, let pt = P(Oij = 0 | Mij = 1) and pb = P(Oij = 2 | Mij = 1) respectively be the prior probabilities that a branch terminates and that it bifurcates at edge eij. Assuming that the edge pairs {eijn} are independent of the other edge pairs given the true state of the edge eij, and by inspecting the probability of admissible events, namely termination or bifurcation at eij, we write the prior term − log(P(T = t)) as

− Σ_{eij∈E} [ Σ_{emi∈E} log(pt) tmij + Σ_{ejn∈E} log(1/pt) tijn + Σ_{ejn,ejk∈E: k<n} log(pb pt) tijn tijk ] ,  (16)

which we derive in Appendix D. In short, minimizing the negative log-likelihood of Eq. 3 amounts to minimizing, with respect to the indicator variables t, the criterion

Σ_{eijk∈F} aijk tijk + Σ_{eijk,eijn∈F} bijkn tijk tijn ,  (17)

which is the sum of the linear and quadratic terms of Eqs. 15 and 16, and whose aijk and bijkn coefficients are obtained by summing the respective terms.

4.5.2 Constraints

Using the same notation as in Section 4.4, we say that an edge pair eijk is active in the solution if tijk = 1. Similarly, an edge eij is active if there exists an active incoming edge pair containing it (i.e., t.ij = 1). We take the feasible region Pc introduced in Eq. 3 to be the set of subgraphs that satisfy certain connectivity conditions. More specifically, we define four sets of constraints to ensure that the solutions to the minimization of Eq. 17 are such that root vertices are not isolated, branches are edge-disjoint, potential crossovers are consistently handled, and all active edge pairs are connected.

Non-isolated Roots: As described in Section 4.4, we assume that we are given a set Vr ⊂ V of root vertices, one for each curvilinear structure of interest in the image. To handle multiple disconnected structures, we create a virtual vertex v and connect it to each given root r ∈ Vr by zero cost edge pairs. We require the root vertices to be connected to at least one vertex other than the virtual one v and to have no active incoming edge other than evr.
We write X eri ∈E X tvri ≥ 1, ∀r ∈ Vr , eijr ∈F tijr + X radius to be more than one. Let pijk denote the geodesic path corresponding to edge pair eijk . We write X tnij + eni ∈E: n6=j X emj ∈E: m6=i tmji ≤ 1, ∀eij ∈ E : i 6= v, tirj = 0, ∀r ∈ Vr . (18) (19) eirj ∈F :i6=v Disjoint Edges: For each edge eij ∈ E, we let at most one edge pair be active among all those that are incoming into eij or eji . Second we prevent the number of active edge pairs that overlap more than a certain fraction of their mean (20) (21) (e =e ) ∨ (e =e ) ∨ (e =e ) ∨ mn mn ji ij kj ln (enl 6=eij ) ∧ (enl 6=eji ) ∧ (enl 6=ejk ) ∧ (e 6=e ) ∧ (emn 6=e ) ∧ (emn 6=e ) ∧ ij ji nl kj ∈F : , (emn 6=ejk ) ∧ (emn 6=ekj ) ∧ l(p ∩p ) > αr̄(p ∩p ) ∧ mnl ijk mnl ijk l(p ) < l(p ) tmnl + tijk ≤ 1, ∀emnl , eijk mnl tijn + log(pb pt )tijn tijk , ejn ∈E ejk ∈E k<n 7 ijk where l(.) and r̄(.) denote the length and mean radius of a path respectively. α is a constant value that determines the allowed extent of the overlap between the geodesic paths of the edges. It is set to 3 in all our experiments. In the example depicted by the figure above, among all the edge pairs incoming to the edges eij , ek1 l1 and ek2 l2 , only one can be active in the final solution. For those curvilinear structures that are inherently trees, these constraints make their recovery from the resulting subgraph possible by starting from the root vertices and following the active edge pairs along the paths that lead to the terminal vertices. Crossover Consistency: A potential crossover in G is a vertex, which is adjacent to at least four other vertices and whose in- and out-degrees are greater than one. A consistent solution containing such a vertex is then defined as the one, in which branches do not terminate at it if its in-degree in the solution is greater than one. The figure above illustrates consistent configurations denoted by a swoosh and inconsistent ones denoted by a cross. 
We express this as

$$\sum_{\substack{e_{ki} \in E \\ k \neq j}} t_{kij} + \sum_{\substack{e_{lm} \in E \\ l \neq q,\, l \neq j}} t_{lmq} - \sum_{\substack{e_{ju} \in E \\ u \neq i}} t_{iju} \leq 1, \quad \forall e_{ij} \in E,\ \forall e_{mq} \in C(j) : m \neq i, \qquad (22)$$

where C(j) = {e_nk ∈ E | k = j ∨ (j ∈ p_nk, n ≠ j, k ≠ j)}. These constraints are only active when dealing with structures that are inherently trees, such as dendritic and axonal arbors. For inherently loopy ones, such as road and blood vessel networks, we deactivate them to allow the creation of junctions that are parts of legitimate cycles.

Connectedness via Network Flow: We require all the active edge pairs to be connected to the virtual vertex v. An edge pair e_ijk is said to be connected if there exists a path in G, starting at v and containing e_ijk, along which all the edge pairs are active.

Let y^l_ij (i ≠ l) be a non-negative continuous flow variable that denotes the number of distinct directed paths that connect v to vertex l and traverse the edge e_ij in the solution. This gives rise to the following constraint set

$$\sum_{e_{vj} \in E} y^l_{vj} = \sum_{e_{il} \in E} y^l_{il}, \quad \forall l \in V \setminus \{v\}, \qquad (23)$$
$$\sum_{e_{vj} \in E} y^l_{vj} \leq \deg^-(l), \quad \forall l \in V \setminus \{v\}, \qquad (24)$$
$$\sum_{e_{ij} \in E} y^l_{ij} = \sum_{e_{jk} \in E} y^l_{jk}, \quad \forall j, l \in V \setminus \{v\} : j \neq l, \qquad (25)$$
$$y^l_{il} \geq t_{ilk}, \quad \forall e_{ilk} \in F, \qquad (26)$$
$$y^l_{il} = \sum_{e_{ki} \in E} t_{kil}, \quad \forall e_{il} \in E : i \neq v, \qquad (27)$$
$$y^l_{ij} \leq \deg^-(l) \sum_{e_{ki} \in E} t_{kij}, \quad \forall e_{ij} \in E : i \neq v,\ \forall l \in V \setminus \{v, i, j\}, \qquad (28)$$
$$y^l_{ij} \geq y^l_{jk} + \deg^-(l)\, (t_{ijk} - 1), \quad \forall e_{ijk} \in F,\ \forall l \in V \setminus \{v\} : j \neq l,\ i \neq l, \qquad (29)$$

where deg⁻(·) is the in-degree of a vertex. The first two constraints guarantee that the amount of flow outgoing from the virtual vertex v toward a vertex l is equal to the flow incoming to l, which must be no larger than the in-degree of l. The constraints in Eq. 25 impose conservation of flow at intermediate vertices of the directed paths from v to l. The following three constraints bind the flow variables to the binary ones, ensuring that a connected network formed by the non-zero flow variables is also connected in the active edge pair variables.
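For intuition, two of the conditions above can be checked directly on a candidate solution: the disjoint-edges condition of Eq. 20 and the connectedness property that Eqs. 23-29 encode with flow variables, namely that every active edge pair is reachable from the virtual vertex v through a chain of active pairs sharing an edge. The following is a minimal sketch, not the authors' implementation, using assumed (i, j, k) triplet lists.

```python
# Illustrative sketch of two feasibility checks on a candidate solution.
from collections import deque

def disjoint_edges_ok(active_pairs):
    """Eq. 20 (simplified): at most one active pair may come into edge
    e_jk or its reverse e_kj; pair (i, j, k) comes into e_jk."""
    count = {}
    for (i, j, k) in active_pairs:
        key = frozenset((j, k))
        count[key] = count.get(key, 0) + 1
    return all(c <= 1 for c in count.values())

def reachable_pairs(active_pairs, v):
    """Pairs linked to v by a chain of active pairs; any active pair
    outside this set violates connectedness."""
    by_first_edge = {}
    for (i, j, k) in active_pairs:
        by_first_edge.setdefault((i, j), []).append((i, j, k))
    queue = deque(p for p in active_pairs if p[0] == v)
    reached = set(queue)
    while queue:
        i, j, k = queue.popleft()
        for nxt in by_first_edge.get((j, k), []):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached
```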
Note that, since we are looking for possibly cyclic subgraphs, there can be multiple paths incoming to a vertex. Hence, unlike the flow variables of Eqs. 7-12, which are bounded by one, the ones defined here have no upper bounds. Finally, the constraints introduced in Eq. 29 ensure that active edge pairs do not form a cycle disconnected from the virtual vertex. This is achieved by imposing conservation of flow on active edge pairs. These constraints are required to ensure that a tree topology can be recovered from a potentially loopy solution by tracing the active edge pairs from the roots r ∈ Vr to the branch terminals.

4.6 Optimization

The two IP formulations introduced in Sections 4.4 and 4.5 are generalizations of the Minimum Arborescence Problem, which is known to be NP-Hard [55]. Nevertheless, we obtained globally optimal or near-optimal solutions for all the problem instances discussed in the results section using the branch-and-cut algorithm [56]. More precisely, the objective values of the solutions we found are guaranteed to be within a small distance 2e−16 of the true optimum. However, the branch-and-cut procedure does not scale well to large graphs, especially when we use the network flow formulation of Eqs. 23-29. To overcome this, we reduce the size of the graphs by pruning or merging some of the edge pairs based on their assigned costs, which we describe in Appendix E.

Furthermore, we introduce a new set of connectivity constraints expressed only in the edge pair variables t_ijk, which we describe in the remainder of this section. This new approach eliminates the need for adding the large number, |V|·|E|, of auxiliary flow variables y^l_ij and their constraints to the optimization. More specifically, we require every subset of active edges to be connected to the virtual vertex v. We impose this by extending the enhanced connectivity constraints introduced in [55], which are defined for edges, to the edge pair case as follows:

$$\sum_{\substack{e_{nl} \in H \\ e_{mn} \in E \setminus (H \cup e_{ij})}} t_{mnl} \;\geq\; \sum_{e_{ki} \in H} t_{kij}, \quad \forall H \subset E \setminus \{e_{vr} \mid r \in V_r\},\ \forall e_{ij} \in E \setminus H : (\forall e_{uw} \in H,\ e_{wu} \in H). \qquad (30)$$

Note that, since we consider every possible subset H of graph edges, the number of constraints grows exponentially with the number of edges and is too large in practice to be dealt with exhaustively. To make the problem tractable, we therefore take a lazy constraint approach, which requires only a small but sufficient fraction of the constraints to be used. This is done iteratively by identifying those constraints of Eq. 30 whose elimination gives rise to disconnections, and adding them to the problem.

More specifically, we start by creating a reduced IP model with the cost function of Eq. 17 and a subset of the constraints given by Eqs. 18-22. We then run the branch-and-cut algorithm on this model and check for connectivity violations in the intermediate solutions it produces. These violations are found in linear time by identifying those connected components that are not connected to the virtual vertex v by a directed path on which all edge pairs are active. Let C be the set of such components and Ec be the set of edges of component c ∈ C. We add a lazy constraint given by Eq. 30 for each such component by randomly selecting an edge e_ij ∈ Ec and forming a new set of edges H = {e_kl | (e_kl ∈ Ec \ e_ij) ∨ (e_lk ∈ Ec \ e_ij)}. We then continue the branch-and-cut optimization with this new model. We repeat this procedure for every intermediate solution until no more constraints are violated and hence the solution is globally optimal. As will be shown in Section 6.5, this requires only a small fraction of the original constraints to be added in practice and yields significantly faster run times compared to the network flow formulation.
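The separation step described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code: active edge pairs are assumed (i, j, k) triplets, disconnected components are found by a traversal from v followed by a union-find grouping, and one edge set H per component is built as in Eq. 30.

```python
# Illustrative sketch of one lazy-constraint separation round (Eq. 30).
from collections import deque

def lazy_constraint_sets(active_pairs, v):
    """Return [(e_ij, H), ...], one entry per component of active edge
    pairs that no chain of active pairs links to the virtual vertex v."""
    by_first = {}
    for (i, j, k) in active_pairs:
        by_first.setdefault((i, j), []).append((i, j, k))
    queue = deque(p for p in active_pairs if p[0] == v)
    reached = set(queue)
    while queue:
        i, j, k = queue.popleft()
        for nxt in by_first.get((j, k), []):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    stranded = [p for p in active_pairs if p not in reached]

    parent = {}  # union-find over the vertices of stranded pairs
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for (i, j, k) in stranded:
        parent[find(i)] = find(j)
        parent[find(j)] = find(k)

    comps = {}
    for (i, j, k) in stranded:
        comps.setdefault(find(i), set()).update([(i, j), (j, k)])
    result = []
    for edges in comps.values():
        e_ij = next(iter(edges))              # randomly selected edge
        rest = edges - {e_ij}
        # H keeps both orientations of the remaining component edges
        result.append((e_ij, rest | {(b, a) for (a, b) in rest}))
    return result
```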
5 PATH CLASSIFICATION

The outcome of the IP procedures introduced in the previous section depends critically on the probabilistic weights appearing in their respective cost functions. These weights are assigned to consecutive edge pairs, which correspond to tubular paths linking three vertices in the graph. A standard approach to computing such weights is to integrate a function of the tubularity values along the paths, as in Eq. 1. However, as shown in Fig. 6, the resulting estimates are often unreliable because a few very high values along the path might offset low values and, as a result, fail to adequately penalize spurious branches and shortcuts. Furthermore, it is often difficult to find an adequate balance between allowing paths to deviate from a straight line and preventing them from meandering too much.

Fig. 6. Tubular graph of Fig. 3(c) with edge weights computed by integrating tubularity values along the paths instead of using our path classification approach. We use the same color scheme as in Fig. 3(d) to demonstrate how much less informative these weights are.

Fig. 7. Path classification vs. integration in portions of microscopy stacks from the DIADEM data [2]. (Top) Scoring paths by summing tubularity values of Section 4.1 results in, from left to right, shortcuts, spurious branches, and missing branches. (Bottom) Our classification approach to scoring paths yields the right answer in all three cases.

In this section, we propose a path-classification approach to computing the probability estimates that we found to be more reliable. More specifically, given a tubular path computed as discussed in Section 4.1, we break it down into several segments and compute one feature vector for each based on gradient histograms. We then use an embedding approach [57] to compute fixed-size descriptors from the potentially arbitrary number of feature vectors we obtain. Finally, we feed these descriptors to a classifier and turn its output into a probability estimate. As shown in the bottom row of Fig. 7, this approach penalizes paths that mostly follow the tree structure but cross the background. Thus, it discourages shortcuts and spurious branches, which the integration approach along the path fails to do. In the remainder of this section, we describe our path features, embedding scheme, and training data collection mechanism in more detail.

5.1 Histogram of Gradient Deviation Descriptors

Gradient orientation histograms have been successfully applied to detecting objects in images and recognizing actions in videos [58], [59], [57]. In a typical setup, the image is first divided into a grid of fixed-size blocks, called cells, and then for each cell, a 1-D histogram of oriented gradients (HOG) is formed from the pixel votes within it. Histograms from neighboring cells are then combined and normalized to form features invariant to local contrast changes. Finally, these features are fed into a classifier to detect objects of interest. We adapt this strategy for tubular paths by defining Histogram of Gradient Deviation (HGD) descriptors as follows.

Fig. 8. Three aspects of our path feature extraction process. An extended neighborhood of points around the path centerline C(s) is defined as the envelope of cross-sectional circles shown in black. This neighborhood is divided into R radius intervals highlighted by the yellow, green and red tubes (here R = 3) and a histogram is created for each such interval. A point x contributes a weighted vote to an angular bin according to the angle between the normal N(x) and the image gradient ∇I(x) at that point.

Given a tubular path γ(s) such as the one depicted by Fig. 8, with s being the curvilinear abscissa, let C(s) be the centerline and r(s) the corresponding radius mappings. We partition the path into equal-length overlapping segments γi(s) and, for each, we compute histograms of angles between image gradients ∇I(x) and normal vectors N(x) obtained from the points x ∈ γi(s) within the tubular segment. To ensure that all the gradient information surrounding the path is captured, we enlarge the segments by a margin m(s) = β·r(s) proportional to the radius values. A fixed-size margin is not preferred, to avoid biased interference of background gradients for thin paths.

Furthermore, to obtain a description of a path's cross-sectional appearance profile, we split its tubular segments into R equally spaced radius intervals, as shown in Fig. 8, and create two histograms for each such interval. While the first histogram captures the strength of the gradient field inside an interval, the second one captures its symmetry about the segment centerline.

Given a point x ∈ γi(s), let C(sx) be the closest centerline point (for simplicity, we assume that such a point is unique), and N(x) be the normal ray vector from C(sx) to x, as illustrated by Fig. 8. Each such point x contributes a vote to a bin of the gradient-strength and gradient-symmetry histograms. The bin is determined by the following equation:

$$\Psi(x) = \begin{cases} \mathrm{angle}(\nabla I(x), N(x)), & \text{if } \|x - C(s_x)\| > \varepsilon \\ \mathrm{angle}(\nabla I(x), \Pi(x)), & \text{otherwise,} \end{cases} \qquad (31)$$

where Π(x) is the cross-sectional plane, which we use to compute the deviation angle in the special case when x belongs to the centerline and the normal ray vector is not defined. The votes are weighted by √‖∇I(x)‖ and ⟨−∇I(x), ∇I(C(sx) − N(x))⟩ for the gradient-strength and gradient-symmetry histograms, respectively. We compute the radius interval and the angular bin indices for a point x respectively as min(R − 1, ⌊R‖N(x)‖/(r(sx) + m(sx))⌋) and min(B − 1, ⌊BΨ(x)/π⌋), where B is the number of angular bins. For each segment, this produces R pairs of histograms, each one corresponding to a radius interval. We interpolate points within each such interval to ensure that enough votes are used to form the histograms. Finally, we normalize each histogram by the number of points that voted for it. This yields a set of histograms for each segment, which we combine into a single HGD descriptor.
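The bin computation of Section 5.1 can be sketched as follows for a single sample point x. This is an illustrative sketch, not the authors' implementation: the arguments are stand-ins for the quantities defined in the text, and only the regular case of Eq. 31 (x away from the centerline) is shown.

```python
# Illustrative sketch of the HGD radius-interval and angular-bin
# indices for one sample point (Section 5.1, regular case of Eq. 31).
import math

def _angle(u, w):
    dot = sum(a * b for a, b in zip(u, w))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(a * a for a in w))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def hgd_bins(x, c_sx, grad, r_sx, m_sx, R, B):
    """x: sample point; c_sx: closest centerline point C(s_x);
    grad: image gradient at x; r_sx, m_sx: radius and margin;
    R, B: number of radius intervals and angular bins.
    (The centerline special case of Eq. 31 is omitted for brevity.)"""
    n = [a - b for a, b in zip(x, c_sx)]   # normal ray N(x)
    norm_n = math.sqrt(sum(a * a for a in n))
    psi = _angle(grad, n)                  # deviation angle, Eq. 31
    radius_bin = min(R - 1, int(R * norm_n / (r_sx + m_sx)))
    angular_bin = min(B - 1, int(B * psi / math.pi))
    return radius_bin, angular_bin
```

A gradient parallel to the normal ray falls in the first angular bin, while a perpendicular one falls near the middle bin, which is what makes the descriptor sensitive to whether a segment actually straddles a tubular structure.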
For each segment, this produces R pairs of histograms, each one corresponding to a radius interval. Finally, we normalize each histogram by the number of points that voted for it. This yields a set of histograms for each segment, which we combine into a single HGD descriptor.

Fig. 9. Flowchart of our approach to extracting appearance features from tubular paths of arbitrary length.
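The per-segment descriptor assembly described above, with points voting into one of R radius intervals and each interval keeping a B-bin histogram of deviation angles that is normalized by its vote count, can be sketched as follows. This is our own illustrative code under simplified assumptions (a single angle histogram per interval; all names are hypothetical), not the authors' implementation:

```python
import math

def segment_descriptor(points, B=9, R=3, max_radius=1.0):
    # points: list of (deviation_angle in [0, pi), radius in [0, max_radius))
    hists = [[0.0] * B for _ in range(R)]
    votes = [0] * R
    for angle, radius in points:
        r = min(int(R * radius / max_radius), R - 1)   # radius interval
        b = min(int(B * angle / math.pi), B - 1)       # angle bin
        hists[r][b] += 1.0
        votes[r] += 1
    # normalize each histogram by the number of points that voted for it
    for r in range(R):
        if votes[r]:
            hists[r] = [h / votes[r] for h in hists[r]]
    # concatenate into a single descriptor of length B * R
    return [v for hist in hists for v in hist]

desc = segment_descriptor([(0.1, 0.2), (1.5, 0.2), (3.0, 0.9)])
```

With B = 9 bins and R = 3 radius intervals, as used later in the experiments, each segment yields a 27-dimensional descriptor.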
5.2 Embedding

The above procedure produces an arbitrary number of HGD descriptors per path. To derive a fixed-size descriptor from them, we first use a Bag-of-Words (BoW) approach to compactly represent their feature space. The words of the BoW model are generated during training by randomly sampling a predefined number of descriptors from the training data.

For a given path of arbitrary length, we then compute an embedding of the path's HGD descriptors into the codewords of the model. As illustrated in Fig. 9, computing the embedding amounts to finding the minimum Euclidean distance from the descriptors to each word in the model. The resulting feature vector of distances has the same length as the number of elements in the BoW model.

To account for geometry and characterize paths that share a common section, such as the one shown in Fig. 4(a), we incorporate into these descriptors four geometry features: the maximum curvature along the centerline curve C(s), its tortuosity, and the normalized lengths along the z and the scale axes defined in Section 4.1. For a path of length L in world coordinates and distance d between its start and end points, these features are defined respectively as max_s ‖T′(s)‖, d/L, ∆z/L and ∆r/L, where ∆z and ∆r are the path lengths along the z and the scale axes, and T(s) is the unit tangent vector depicted in Fig. 8.

5.3 Training

To train the path classifier, we obtain positive samples by simply sampling the ground truth delineations associated with our training images. To obtain negative samples, we first build tubular graphs in these training images using the method of Section 4.1. We then randomly select paths from these graphs and attempt to find matching paths in the ground truth tracings. They are considered as negative samples if they satisfy certain incompatibility conditions, such as having a small overlap with the matching path. We describe these conditions in Appendix F.
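The codeword embedding of Section 5.2 can be sketched as follows: each path's variable-size set of descriptors maps to a fixed-size vector whose i-th entry is the minimum Euclidean distance from any of the path's descriptors to codeword i, after which the four geometry features are appended. This is our own illustrative code with hypothetical names, not the authors' implementation:

```python
import math

def embed(path_descriptors, codewords, geometry):
    # geometry: (max curvature, tortuosity d/L, delta_z/L, delta_r/L)
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # one minimal distance per codeword, independent of path length
    vec = [min(dist(d, w) for d in path_descriptors) for w in codewords]
    return vec + list(geometry)  # length = len(codewords) + 4

codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
feats = embed([[0.0, 0.1], [1.0, 0.0]], codebook,
              geometry=(0.3, 0.8, 0.1, 0.05))
```

Because only the minimum distance per codeword is kept, paths of very different lengths still yield vectors of identical dimension, which is what makes a standard classifier applicable.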
We label those negatives that partially overlap with the matching paths as hard-to-classify examples, and those that do not overlap as easy-to-classify ones. In our experiments, hard-to-classify examples constitute 99 percent of the negative examples. Path lengths are randomly chosen from a length distribution learnt from the graphs in the training dataset. This is achieved by first labeling consecutive edge pairs in the graphs as positive or negative using the above procedure, and then constructing two separate length distributions, one for each class, using the assigned labels.

Finally, at detection time, we run the path classifier on consecutive edge pairs and assign them the resulting probabilities of belonging to the underlying curvilinear networks.

6 RESULTS

In this section, we first introduce the datasets used to validate our approach. We then describe the experimental setup used to analyze the effect of the feature parameters in the path classification step, and present results on the performance of our reconstruction approach for both tree and loopy networks.

6.1 Datasets

We evaluated our approach on the five datasets depicted in Fig. 1 and described in detail below:
• Aerial: Aerial images of road networks obtained from Google maps. We used 21 grayscale versions of these images to train our path classifier and 10 for validation.
• Confocal-Vessels: Two image stacks of blood vessel networks acquired with a confocal microscope. We used a portion of one for training and both for testing. We only considered the red channel of these stacks since it is the only one used to label the blood vessels.
• Brightfield: Six image stacks acquired by brightfield microscopy from biocytin-stained rat brains. We used three for training and three for testing.
• Confocal-Axons: 3D confocal microscopy image stacks of Olfactory Projection Fibers (OPF) of the Drosophila fly taken from the DIADEM competition [2]. The dataset consists of 8 image stacks of axonal trees, which we split into a training and a validation set, leaving 4 stacks in the latter.
• Brainbow: Neurites of this dataset were acquired by targeting mice primary visual cortex using the brainbow technique [1], so that each neurite has a distinct color. We used one image stack for training and three for testing.

For all five datasets, we used a semi-automated delineation tool [60] we developed to obtain the ground truth tracings. We built the tubular graphs using a seed selection distance of d = 5∆I for the Confocal-Axons dataset and d = 20∆I for the other four, and a seed linking distance of l(d) = 5d, where ∆I denotes the minimum voxel spacing. In all our experiments, we used the same parameters to compute the HGD descriptors: segment length 2∆I; segment sampling step 0.5∆I, implying a 75% overlap; B = 9 histogram bins; R = 3 radius intervals; radius margin β = 0.6; and a BoW model of 300 codewords.

6.2 Path Classification

We have tested our approach using five different classifiers: SVM classifiers with linear and RBF kernels, random forest (RF) classifiers with oblique and orthogonal splits, and a gradient boosted decision tree (GDBT) classifier. To train the classifiers, we randomly sampled 30000 positive and 30000 negative paths from the graphs.
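The paper trains SVM, random-forest and GDBT classifiers on the embedded path features. As a library-free stand-in for any of these, the following sketch fits a tiny logistic model on labelled feature vectors and outputs the probability that an edge pair belongs to the network; the data and all names here are synthetic, and this is not the authors' classifier:

```python
import math, random

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=200):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# synthetic positive/negative feature clusters standing in for path features
random.seed(0)
pos = [[random.gauss(1.0, 0.3), random.gauss(1.0, 0.3)] for _ in range(50)]
neg = [[random.gauss(-1.0, 0.3), random.gauss(-1.0, 0.3)] for _ in range(50)]
w, b = train_logistic(pos + neg, [1] * 50 + [0] * 50)
```

At detection time, such probabilities are what get assigned to consecutive edge pairs before the integer program selects the optimal subset of paths.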
For assessing the ROC performance, we used 10000 positive and negative samples at detection time.

Fig. 10. (a) ROC curves for the five classifiers (GDBT, RF-Orthogonal, RF-Oblique, SVM-Linear, SVM-RBF) evaluated on the Aerial dataset. (b) Curves for all the five datasets using the GDBT classifier. (c-f) Influence of the feature extraction parameters (segment length, number of histogram bins, number of radius intervals, and codebook size) on the classification accuracy for the Aerial dataset using the same classifier.

TABLE 1
Reconstruction accuracy measured by the DIADEM [2] metric on four test stacks of the Confocal-Axons dataset. Each column corresponds to an image stack denoted by OPFi. Higher scores are better.

              OPF4   OPF6   OPF7   OPF8
Loopy-IP      0.91   0.91   0.94   0.90
Arbor-IP      0.89   0.90   0.93   0.90
k-MST [8]     0.87   0.90   0.91   0.74
NS [13]       0.58   0.65   0.42   0.58
OSnake [18]   0.00   0.80   0.68   0.69
APP2 [62]     0.67   0.82   0.76   0.63

For image stacks of the Brainbow dataset, we extended the histograms of gradient deviation features described in Section 5.1 to include color information. This is done by first converting the stacks into the CIELAB space and clustering their voxels using the K-Means algorithm with K = 50 clusters. For each cluster, we then computed a normalized gray scale image whose voxel values are inversely proportional to the color distance between the original voxel and the cluster mean. This results in K gray scale images, and each voxel is associated with the one its cluster corresponds to. For a voxel along any given path, the gradient features are computed on this associated image.
As a result, only gradients corresponding to voxels with a similar color are taken into account.

Fig. 10(a) shows the classification performance achieved by the five classifiers on the validation set of the Aerial dataset. We will use the GDBT classifier in the remainder of the paper since it clearly outperforms the two RF ones, which both perform better than the SVMs. Fig. 10(b) shows the performance on all five datasets using this classifier. The Brightfield and Confocal-Vessels datasets yield a lower true positive rate at a fixed false positive rate than the other three. In the case of the Brightfield dataset, this is because of structure irregularities and significant point spread function blurring. In the Confocal-Vessels case, it is mainly due to non-uniformly stained blood vessels.

Fig. 10(c-f) provides an analysis of the effect of the feature parameter settings on classification accuracy. Briefly, increasing the number of radius intervals, histogram bins and codebook entries improves the accuracy at the expense of a higher computational cost. On the other hand, short segment lengths provide a fine-grained sampling of the paths and hence slightly better performance.

6.3 Neural Structures

The neurites of the Confocal-Axons, Brightfield and Brainbow datasets form tree structures without cycles. However, in the latter two datasets, because of the low z-resolution, disjoint branches appear to cross, introducing false loops.
As described in Section 4.5, these loops can be successfully identified, and a tree topology can be recovered from a loopy solution by tracing the active edge pairs from the root vertices to the branch terminals. Fig. 11 shows two tree reconstructions we obtained for each of these datasets using the formulation of Section 4.5. Additional ones are supplied as supplementary material.

For quantitative evaluation, we use the DIADEM metric [2], which is designed to compare the topological accuracy of a reconstructed tree against a ground truth tree. Misconnections in the reconstruction are weighted by the size of the subtree to which the associated ground truth connection leads. As a result, connections that are close to the given root vertices have a higher importance than those that are close to the branch terminals.

We compare our tree and loopy reconstruction approaches, which we refer to as Arbor-IP and Loopy-IP, to several state-of-the-art algorithms: the pruning-based approach (APP2) of [62], the active contour algorithm (OSnake) of [18], the NeuronStudio (NS) software [13], the focus-based depth estimation method (Focus) of [63], and finally the k-MST technique of [8], the last two of which were finalists in the DIADEM competition. For all these algorithms, we used the implementations provided by their respective authors with default parameters.

TABLE 2
DIADEM scores on the Brainbow and Brightfield datasets. All three methods accept multiple roots and produce a connected delineation for each one. Columns BRBWi and BRFi denote stacks from the Brainbow and Brightfield datasets.

              BRBW1  BRBW2  BRBW3  BRF1   BRF2   BRF3
Loopy-IP      0.65   0.56   0.66   0.76   0.50   0.80
Arbor-IP      0.44   0.32   0.45   0.59   0.32   0.71
k-MST [8]     0.37   0.30   0.41   0.51   0.28   0.47

Table 1 provides the DIADEM scores for the four images of the Confocal-Axons dataset.
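The recovery of a tree topology from a loopy solution, described above as tracing active edge pairs from the roots to the branch terminals, can be sketched with a breadth-first trace. This simplified illustration of ours operates on plain directed edges rather than the paper's edge pairs, and all names are hypothetical:

```python
from collections import deque

def recover_tree(active_edges, roots):
    # adjacency over the active (selected) directed edges
    out = {}
    for u, v in active_edges:
        out.setdefault(u, []).append(v)
    parent = {r: None for r in roots}
    queue = deque(roots)
    while queue:
        u = queue.popleft()
        for v in out.get(u, []):
            if v not in parent:      # first visit wins -> acyclic result
                parent[v] = u
                queue.append(v)
    return parent  # maps each reached vertex to its parent in the tree

edges = [(0, 1), (1, 2), (2, 3), (3, 1)]   # contains a spurious loop 1-2-3-1
tree = recover_tree(edges, roots=[0])
```

Keeping only the first parent that reaches each vertex is what prevents a spurious branch crossing from introducing a cycle in the recovered topology.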
The Loopy-IP and Arbor-IP approaches perform similarly, mainly because the images of this dataset have a sufficiently high z-resolution and the axons contain only a few false cycles. However, for the Brightfield and Brainbow datasets, the Loopy-IP approach systematically outperforms the Arbor-IP one, as can be seen in Table 2, which demonstrates the effectiveness of the loopy formulation in dealing with spurious branch crossings. Note from Fig. 11 that the images of these two datasets contain multiple tree structures (i.e., |Vr| > 1). That is why, for each image, we compute a weighted average of the scores, where the weights are the ratios of the individual tree lengths to the total length of all the trees in the ground truth.

Fig. 11. Delineation results, best viewed in color. Top Rows: For each dataset, two minimal or maximal projections and overlaid delineation results. Each connected curvilinear network is shown in a distinct color. Bottom Row: Four road images with final delineations shifted and overlaid to allow comparisons. The renderings are created using the Vaa3D software [61].

We also evaluated the APP2 [62], OSnake [18], and Focus [63] algorithms on the Brightfield dataset. Since they do not allow the user to provide multiple root vertices, the DIADEM score of their output cannot be computed. To compare these algorithms to ours, we therefore used the NetMets [64] measure, which does not heavily rely on the roots. Like the DIADEM metric, it takes as input the reconstruction, the corresponding ground truth tracings, and a sensitivity parameter σ, which is set to 3∆I in all our experiments. However, it provides a relatively local measure compared to the DIADEM scores, since it does not take the network topology into account.
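The per-image weighted average described above, with each tree's DIADEM score weighted by its share of the total ground truth trace length, amounts to the following small computation (our illustrative code, with made-up numbers):

```python
def weighted_diadem(scores, gt_lengths):
    # scores: per-tree DIADEM scores; gt_lengths: ground truth tree lengths
    total = sum(gt_lengths)
    return sum(s * (l / total) for s, l in zip(scores, gt_lengths))

# two trees: score 0.9 over 300 units of trace, score 0.5 over 100 units
avg = weighted_diadem([0.9, 0.5], [300.0, 100.0])  # 0.9*0.75 + 0.5*0.25
```

This weighting keeps a small, poorly traced tree from dominating the score for a stack whose large trees are reconstructed well.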
Table 3 shows the NetMets scores on the test images of the Brightfield dataset. Note that the Focus [63] algorithm is specifically designed for brightfield image stacks distorted by a point spread function. Our approach nevertheless brings about a significant improvement except in one case (BRF3, connectivity FPR), where the best algorithm, Focus, yields a very high false negative rate.

TABLE 3
NetMets [64] scores on the Brightfield dataset. For each trial, the NetMets software outputs four values, which are geometric False Positive Rate (FPR), geometric False Negative Rate (FNR), connectivity FPR, and connectivity FNR, respectively from left to right. Lower values are better.

              BRF1                     BRF2                     BRF3
Loopy-IP      0.05  0.29  0.71  0.65   0.11  0.29  0.81  0.78   0.07  0.28  0.77  0.70
k-MST [8]     0.10  0.44  0.79  0.88   0.11  0.53  0.84  0.91   0.13  0.35  0.81  0.92
Focus [63]    0.39  0.54  0.75  1.00   0.49  0.53  0.90  1.00   0.38  0.46  0.74  1.00
OSnake [18]   0.66  0.63  0.98  0.99   0.66  0.59  0.99  1.00   0.69  0.38  0.95  0.99
APP2 [62]     0.68  0.64  1.00  1.00   0.63  0.54  1.00  1.00   0.65  0.49  1.00  0.99

6.4 Roads and Blood Vessels

The roads and blood vessels of the Aerial and Confocal-Vessels datasets form networks that contain many real cycles. Many road networks of the Aerial dataset are partially occluded by trees or the shadows they cast, making the delineation task challenging. However, as can be seen in the last row of Fig. 11, in spite of the occlusions, the road networks are recovered almost perfectly. The main errors are driveways that are treated as very short roads, thin side streets mistaken for highways, and a few roads dead-ending because the connecting path to the closest junction is severely occluded. The first could be addressed by introducing a threshold on short overhanging segments, while the latter two would require a much more sophisticated semantic understanding.
The quality of the blood vessel delineations depicted by the second row is much harder to assess on the printed page but becomes clear when looking at the rotating volumes available as supplementary material.

6.5 Run Time Analysis

Table 4 provides running times for the network flow and subset formulations described in Sections 4.5.2 and 4.6 respectively. We solve both problems to optimality, which results in exactly the same solution. As described in Section 4.6, the latter is solved iteratively by identifying violations of the connectivity constraints of Eq. 30 and adding them to the problem. As can be seen from the table, in practice, only a small fraction of these constraints are needed to ensure connectivity, and the approach results in significant speed-ups, especially on those problem instances that are hard to solve using the network flow formulation.

TABLE 4
Run-times for the network flow and subset formulations of Sections 4.5.2 and 4.6 respectively, averaged over 3 runs per image and measured in minutes on a multi-core machine. While the flow constraints are imposed all at once, the subset ones are added lazily as described in Section 4.6. The last row shows the total number of lazy constraints added in the branch-and-cut search before the global optimum is reached. CONVi denotes an image stack from the Confocal-Vessels dataset.

                BRBW1  BRBW2  BRBW3  BRF1   BRF2  BRF3  CONV1  CONV2
Network Flow    166.7  132.9  291.1  379.3  17.2  4.9   8.4    106.8
Subset + Lazy   32.9   4.9    19.5   36.6   10.9  7.7   0.6    4.2
# Lazy Constr.  1008   892    1092   1556   1165  2259  972    5175

7 CONCLUSION

We have proposed a novel graph-based approach to delineating complex curvilinear structures that lets us find the desired network as the global optimum of a well-designed objective function. Unlike most earlier techniques, it explicitly handles the fact that the structures may be cyclic and builds graphs in which vertices may belong to more than one branch. Another key ingredient is a classification-based approach to scoring the quality of paths. This combination allows us to outperform state-of-the-art methods without having to manually tune the algorithm parameters for each new dataset.

REFERENCES
[1] J. Livet, T. Weissman, H. Kang, R. Draft, J. Lu, R. Bennis, J. Sanes, and J. Lichtman, “Transgenic Strategies for Combinatorial Expression of Fluorescent Proteins in the Nervous System,” Nature, vol. 450, no. 7166, pp. 56–62, 2007.
[2] G. A. Ascoli, K. Svoboda, and Y. Liu, “Digital Reconstruction of Axonal and Dendritic Morphology Diadem Challenge,” 2010.
[3] H. Peng, F. Long, T. Zhao, and E. Myers, “Proof-editing is the Bottleneck of 3D Neuron Reconstruction: The Problem and Solutions,” Neuroinformatics, vol. 9, no. 2, pp. 103–105, 2011.
[4] Y. Wang, A. Narayanaswamy, C. Tsai, and B. Roysam, “A Broadly Applicable 3D Neuron Tracing Method Based on Open-Curve Snake,” Neuroinformatics, vol. 9, no. 2-3, pp. 193–217, 2011.
[5] P. Chothani, V. Mehta, and A. Stepanyants, “Automated Tracing of Neurites from Light Microscopy Stacks of Images,” Neuroinformatics, vol. 9, pp. 263–278, 2011.
[6] E. Bas and D. Erdogmus, “Principal Curves as Skeletons of Tubular Objects - Locally Characterizing the Structures of Axons,” Neuroinformatics, vol. 9, no. 2-3, pp. 181–191, 2011.
[7] T. Zhao, J. Xie, F. Amat, N. Clack, P. Ahammad, H. Peng, F. Long, and E. Myers, “Automated Reconstruction of Neuronal Morphology Based on Local Geometrical and Global Structural Models,” Neuroinformatics, vol. 9, pp. 247–261, May 2011.
[8] E. Turetken, G. Gonzalez, C. Blum, and P. Fua, “Automated Reconstruction of Dendritic and Axonal Trees by Global Optimization with Geometric Priors,” Neuroinformatics, vol. 9, no. 2-3, 2011.
[9] A. Choromanska, S. Chang, and R. Yuste, “Automatic Reconstruction of Neural Morphologies with Multi-Scale Graph-Based Tracking,” Frontiers in Neural Circuits, vol. 6, no. 25, 2012.
[10] F. Benmansour and L.
Cohen, “Tubular Structure Segmentation Based on Minimal Path Method and Anisotropic Enhancement,” International Journal of Computer Vision, vol. 92, no. 2, 2011.
[11] E. Turetken, F. Benmansour, and P. Fua, “Automated Reconstruction of Tree Structures Using Path Classifiers and Mixed Integer Programming,” in Conference on Computer Vision and Pattern Recognition, June 2012.
[12] E. Turetken, F. Benmansour, B. Andres, H. Pfister, and P. Fua, “Reconstructing Loopy Curvilinear Structures Using Integer Programming,” in Conference on Computer Vision and Pattern Recognition, June 2013.
[13] S. Wearne, A. Rodriguez, D. Ehlenberger, A. Rocher, S. Henderson, and P. Hof, “New Techniques for Imaging, Digitization and Analysis of Three-Dimensional Neural Morphology on Multiple Scales,” Neuroscience, vol. 136, no. 3, pp. 661–680, 2005.
[14] H. Cuntz, F. Forstner, A. Borst, and M. Häusser, “One Rule to Grow Them All: A General Theory of Neuronal Branching and Its Practical Application,” PLoS Computational Biology, vol. 6, 2010.
[15] D. R. Myatt, T. Hadlington, G. A. Ascoli, and S. J. Nasuto, “Neuromantic - from Semi Manual to Semi Automatic Reconstruction of Neuron Morphology,” Frontiers in Neuroinformatics, vol. 6, 2012.
[16] M. Longair, D. A. Baker, and J. D. Armstrong, “Simple Neurite Tracer: Open Source Software for Reconstruction, Visualization and Analysis of Neuronal Processes,” Bioinformatics, vol. 27, 2011.
[17] H. Peng, F. Long, and G. Myers, “Automatic 3D Neuron Tracing Using All-Path Pruning,” Bioinformatics, vol. 27, 2011.
[18] Y. Wang, A. Narayanaswamy, and B. Roysam, “Novel 4D Open-Curve Active Contour and Curve Completion Approach for Automated Tree Structure Extraction,” in Conference on Computer Vision and Pattern Recognition, 2011, pp. 1105–1112.
[19] George Mason University, “NeuroMorpho.Org,” http://neuromorpho.org.
[20] J. C.
Fiala, “Reconstruct: A Free Editor for Serial Section Microscopy,” Journal of Microscopy, vol. 218, pp. 52–61, 2005.
[21] K. Brown, D. E. Donohue, G. D’Alessandro, and G. A. Ascoli, “A Cross-Platform Freeware Tool for Digital Reconstruction of Neuronal Arborizations from Image Stacks,” Neuroinformatics, vol. 3, no. 4, pp. 343–359, 2005.
[22] Bitplane, “Imaris: 3D and 4D Real-Time Interactive Image Visualization,” 2013, http://www.bitplane.com/go/products/imaris/.
[23] E. Meijering, M. Jacob, J.-C. F. Sarria, P. Steiner, H. Hirling, and M. Unser, “Design and Validation of a Tool for Neurite Tracing and Analysis in Fluorescence Microscopy Images,” Cytometry Part A, vol. 58A, no. 2, pp. 167–176, April 2004.
[24] Z. Fanti, F. F. De-Miguel, and M. E. Martinez-Perez, “A Method for Semiautomatic Tracing and Morphological Measuring of Neurite Outgrowth from DIC Sequences,” in Engineering in Medicine and Biology Society, 2008, pp. 1196–1199.
[25] R. Srinivasan, Q. Li, X. Zhou, J. Lu, J. Lichtman, and S. T. C. Wong, “Reconstruction of the Neuromuscular Junction Connectome,” Bioinformatics, vol. 26, no. 12, pp. 64–70, 2010.
[26] H. Peng, Z. Ruan, D. Atasoy, and S. Sternson, “Automatic Reconstruction of 3D Neuron Structures Using a Graph-Augmented Deformable Model,” Bioinformatics, vol. 26, no. 12, 2010.
[27] J. Lu, “Neuronal Tracing for Connectomic Studies,” Neuroinformatics, vol. 9, no. 2-3, pp. 159–166, 2011.
[28] D. Donohue and G. Ascoli, “Automated Reconstruction of Neuronal Morphology: An Overview,” Brain Research Reviews, vol. 67, 2011.
[29] E. Meijering, “Neuron Tracing in Perspective,” Cytometry Part A, vol. 77, no. 7, pp. 693–704, 2010.
[30] A. Perez-Rovira and E. Trucco, “Robust Optic Disc Location via Combination of Weak Detectors,” in Engineering in Medicine and Biology Society Conference, August 2008, pp. 3542–3545.
[31] S. Urban, S. O’Malley, B. Walsh, A. Santamaría-Pang, P. Saggau, C. Colbert, and I.
Kakadiaris, “Automatic Reconstruction of Dendrite Morphology from Optical Section Stacks,” in CVAMIA, 2006.
[32] C. Weaver, P. Hof, S. Wearne, and L. Brent, “Automated Algorithms for Multiscale Morphometry of Neuronal Dendrites,” Neural Computation, vol. 16, no. 7, pp. 1353–1383, 2004.
[33] T. Lee, R. Kashyap, and C. Chu, “Building Skeleton Models via 3D Medial Surface Axis Thinning Algorithms,” Computer Vision, Graphics, and Image Processing, vol. 56, no. 6, pp. 462–478, 1994.
[34] K. Palagyi and A. Kuba, “A 3D 6-Subiteration Thinning Algorithm for Extracting Medial Lines,” Pattern Recognition Letters, vol. 19, no. 7, pp. 613–627, 1998.
[35] M. Pool, J. Thiemann, A. Bar-Or, and A. E. Fournier, “NeuriteTracer: A Novel ImageJ Plugin for Automated Quantification of Neurite Outgrowth,” Journal of Neuroscience Methods, vol. 168, no. 1, 2008.
[36] J. Xu, J. Wu, D. Feng, and Z. Cui, “DSA Image Blood Vessel Skeleton Extraction Based on Anti-Concentration Diffusion and Level Set Method,” Computational Intelligence and Intelligent Systems, vol. 51, pp. 188–198, 2009.
[37] H. Cai, X. Xu, J. Lu, J. Lichtman, S. Yung, and S. Wong, “Repulsive Force Based Snake Model to Segment and Track Neuronal Axons in 3D Microscopy Image Stacks,” NeuroImage, vol. 32, no. 4, 2006.
[38] Z. Vasilkoski and A. Stepanyants, “Detection of the Optimal Neuron Traces in Confocal Microscopy Images,” Journal of Neuroscience Methods, vol. 178, no. 1, pp. 197–204, 2009.
[39] A. Can, H. Shen, J. Turner, H. Tanenbaum, and B. Roysam, “Rapid Automated Tracing and Feature Extraction from Retinal Fundus Images Using Direct Exploratory Algorithms,” Transactions on Information Technology in Biomedicine, vol. 3, no. 2, 1999.
[40] K. Al-Kofahi, S. Lasek, D. Szarowski, C. Pace, G. Nagy, J. Turner, and B. Roysam, “Rapid Automated Three-Dimensional Tracing of Neurons from Confocal Image Stacks,” Transactions on Information Technology in Biomedicine, vol. 6, no. 2, pp. 171–187, 2002.
[41] T. Yedidya and R.
Hartley, “Tracking of Blood Vessels in Retinal Images Using Kalman Filter,” in Digital Image Computing: Techniques and Applications, 2008, pp. 52–58.
[42] A. Mukherjee and A. Stepanyants, “Automated Reconstruction of Neural Trees Using Front Re-Initialization,” in SPIE, 2012.
[43] M. Law and A. Chung, “Three Dimensional Curvilinear Structure Detection Using Optimally Oriented Flux,” in European Conference on Computer Vision, 2008.
[44] M. Law and A. Chung, “An Oriented Flux Symmetry Based Active Contour Model for Three Dimensional Vessel Segmentation,” in European Conference on Computer Vision, 2010, pp. 720–734.
[45] L. Domanski, C. Sun, R. Hassan, P. Vallotton, and D. Wang, “Linear Feature Detection on GPUs,” in DICTA, 2010.
[46] J. D. Wegner, J. A. Montoya-Zegarra, and K. Schindler, “A Higher-Order CRF Model for Road Network Extraction,” in Conference on Computer Vision and Pattern Recognition, 2013.
[47] D. Fan, “Bayesian Inference of Vascular Structure from Retinal Images,” Ph.D. dissertation, Dept. of Computer Science, U. of Warwick, Coventry, UK, 2006.
[48] K. Sun, N. Sang, and T. Zhang, “Marked Point Process for Vascular Tree Extraction on Angiogram,” in Conference on Computer Vision and Pattern Recognition, 2007, pp. 467–478.
[49] M. Fischler, J. Tenenbaum, and H. Wolf, “Detection of Roads and Linear Structures in Low-Resolution Aerial Imagery Using a Multisource Knowledge Integration Technique,” Computer Vision, Graphics, and Image Processing, vol. 15, no. 3, pp. 201–223, 1981.
[50] H. Li and A. Yezzi, “Vessels as 4-D Curves: Global Minimal 4D Paths to Extract 3-D Tubular Surfaces and Centerlines,” IEEE Transactions on Medical Imaging, vol. 26, no. 9, 2007.
[51] J. Staal, M. Abramoff, M. Niemeijer, M. Viergever, and B. van Ginneken, “Ridge Based Vessel Segmentation in Color Images of the Retina,” IEEE Transactions on Medical Imaging, 2004.
[52] E. Turetken, C. Becker, P. Glowacki, F. Benmansour, and P.
Fua, “Detecting Irregular Curvilinear Structures in Gray Scale and Color Imagery Using Multi-Directional Oriented Flux,” in International Conference on Computer Vision, December 2013.
[53] J. A. Sethian, Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science. Cambridge University Press, 1999.
[54] M. Dorigo and T. Stützle, Ant Colony Optimization. MIT Press, 2004.
[55] C. Duhamel, L. Gouveia, P. Moura, and M. Souza, “Models and Heuristics for a Minimum Arborescence Problem,” Networks, vol. 51, no. 1, pp. 34–47, 2008.
[56] Gurobi Optimizer, 2014, http://www.gurobi.com/.
[57] D. Weinland, M. Ozuysal, and P. Fua, “Making Action Recognition Robust to Occlusions and Viewpoint Changes,” in European Conference on Computer Vision, September 2010.
[58] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” in Conference on Computer Vision and Pattern Recognition, 2005.
[59] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, “Object Detection with Discriminatively Trained Part Based Models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010.
[60] E. Turetken, F. Benmansour, and P. Fua, “Semi-Automated Reconstruction of Curvilinear Structures in Noisy 2D Images and 3D Image Stacks,” EPFL, Tech. Rep., 2013.
[61] H. Peng, Z. Ruan, F. Long, J. Simpson, and E. Myers, “V3D Enables Real-Time 3D Visualization and Quantitative Analysis of Large-Scale Biological Image Data Sets,” Nature Biotechnology, vol. 28, no. 4, pp. 348–353, 2010.
[62] H. Xiao and H. Peng, “APP2: Automatic Tracing of 3D Neuron Morphology Based on Hierarchical Pruning of a Gray-weighted Image Distance-Tree,” Bioinformatics, vol. 29, no. 11, 2013.
[63] A. Narayanaswamy, Y. Wang, and B. Roysam, “3D Image Pre-Processing Algorithms for Improved Automated Tracing of Neuronal Arbors,” Neuroinformatics, vol. 9, no. 2-3, pp. 219–231, 2011.
[64] D. Mayerich, C. Bjornsson, J.
Taylor, and B. Roysam, “NetMets: Software for Quantifying and Visualizing Errors in Biological Network Segmentation,” BMC Bioinformatics, vol. 13, 2012.