A Massively Parallel Algorithm for Polyline Simplification Using an Associative Computing Model

Huy Tran, Department of Computing Sciences, Texas A&M University Corpus Christi, Email: [email protected]
Michael Scherger, Department of Computing Sciences, Texas A&M University Corpus Christi, Email: [email protected]

Abstract - Line simplification is a process of reducing the number of line segments used to represent a polyline. This reduction in the number of line segments and vertices can improve the performance of spatial analysis applications. The classic Douglas-Peucker algorithm, developed in 1973, has a complexity of O(mn), where n denotes the number of vertices and m the number of line segments. Its enhanced version, proposed in 1992, has a complexity of O(n log n). In this paper, we present a parallel line simplification algorithm using the Multiple-instruction-stream Associative Computing model (MASC). Using one instruction stream of the MASC model, our algorithm has a parallel complexity of O(n) in the worst case using n processing elements.

Keywords: Parallel algorithms, associative computing, SIMD algorithms, line simplification, vertex elimination, level curve

1. Introduction

2D planar level curves are the polylines along which mathematical functions take on constant values. In applications such as AutoCAD or MATLAB, these planar level curves are represented as collections of line segments. An example of a level curve in AutoCAD is shown in Figure 1. The number of digitized line segments collected is far more than necessary [2]. Due to the complexity of the geospatial functions, the number of line segments needed to represent a planar level curve can be very large, which may cause inefficiencies in visual performance. Therefore, the polyline needs to be represented with fewer segments and vertices, and it is necessary to apply a polyline simplification algorithm to the 2D planar level curve.
If the line segments were acquired in stream order, the end vertex of one line segment would be the beginning vertex of the next line segment in the file, and it would be straightforward to apply a polyline simplification algorithm. However, in this problem the line segments of the polylines are digitized in raster scan order. The raster scan ordering of the line segments requires intensive searching on the remaining set of line segments to reconstruct the polyline (O(n²) searches).

Figure 1: An example of a level curve.

The Douglas-Peucker line simplification algorithm is considered an effective line simplification algorithm [2, 13]. The algorithm uses the closeness of a vertex to a segment as a rejection condition. Its worst-case complexity is O(mn), where n denotes the number of vertices and m the number of segments. In 1992, Hershberger and Snoeyink introduced an improvement to the Douglas-Peucker algorithm that achieves an enhanced O(n log n) time complexity [4]. The speedup is obtained by applying binary searches to maintain the path hulls of subchains. Different approaches to this problem have also been discussed in [5, 10, 12]. However, even the worst-case complexities O(mn) and O(n log n) are considered computationally expensive when working with very large visualizations.

In this paper, we present a polyline simplification algorithm using the Multiple-instruction-stream Associative Computing model (MASC) [6, 8] to reduce the number of vertices required to represent polylines. MASC is an enhanced SIMD model with associative properties. By using the constant-time global operations of the MASC model, our algorithm has a parallel complexity of linear time O(n) in the worst case.

This paper is organized as follows. Section 2 discusses polyline simplification in more detail and provides the sequential algorithm on which our parallel algorithm is based. Section 3 introduces the MASC model of computation and its properties.
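For concreteness, the vertex-rejection strategy of the Douglas-Peucker algorithm described above can be sketched as follows. This is an illustrative sketch only, not the implementation evaluated in this paper; the function name `douglas_peucker` and the recursive structure are our own choices.

```python
import math

def douglas_peucker(points, epsilon):
    """Simplify a polyline given as a list of (x, y) tuples.

    The endpoints are kept; the interior vertex farthest from the chord
    is found, and the chain is split there only if that distance exceeds
    epsilon (the rejection condition described in the text)."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]

    def dist(p):
        # Perpendicular distance from p to the chord (x1, y1)-(x2, y2).
        px, py = p
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy)
        if norm == 0.0:  # degenerate chord: fall back to point distance
            return math.hypot(px - x1, py - y1)
        return abs(dy * px - dx * py + x2 * y1 - y2 * x1) / norm

    index, dmax = max(((i, dist(p)) for i, p in enumerate(points[1:-1], 1)),
                      key=lambda t: t[1])
    if dmax <= epsilon:
        return [points[0], points[-1]]  # every interior vertex is rejected
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split vertex
```

The worst case arises when each recursive split removes only one vertex, which gives the O(mn) behavior noted above.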
Section 4 will describe our massively parallel line simplification algorithm using the MASC model. Finally, Section 5 will provide a discussion of future work and conclusions.

2. Polyline Simplification and a Sequential Algorithm

Consider the line segments shown in Figure 2. Two connected segments, for example AB and CD, can have one of the four following arrangements:

1. A is coincident or near-coincident to C.
2. A is coincident or near-coincident to D.
3. B is coincident or near-coincident to C.
4. B is coincident or near-coincident to D.

The coincidence or collinearity of vertices is determined by examining their coordinates (x, y). Two near-coincident vertices are considered coincident if their distance is less than or equal to an accepted tolerance α. In Figure 2, the digitized order is A B … C D, where D is not next to B in the record file although B ≈ D within the tolerance α.

Figure 2: An example of how two connected segments are not in a stream order

2.1. A specific problem

In order to perform the polyline simplification, the digitized line segments in the file need to be re-arranged. The random order of the digitized line segments necessitates a massive number of search operations to determine coincident points. This is illustrated by the following sequential algorithm, which re-arranges the line segments into a stream in which segments with coincident vertices are moved close to each other. As shown in Figure 3, five line segments have been digitized. After rearrangement, the stream order would be: B2 A2 A1 B1 B3 A3 A4 B4 A5 B5. Then each vertex is checked against the next vertex for coincidence and eliminated accordingly. In this example, vertices A1, B1, A3, and B4 are eliminated due to coincidence, and vertices A4 and B3 are deleted due to collinearity.
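The coincidence and collinearity tests described above can be sketched as small helper functions. This is a minimal sketch; the tolerance value `ALPHA` and the function names are illustrative assumptions, not taken from the paper.

```python
import math

ALPHA = 0.5  # accepted tolerance; this particular value is an assumption

def coincident(a, b, alpha=ALPHA):
    """Two vertices are treated as coincident if their distance <= alpha."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= alpha

def collinear(a, b, c, alpha=ALPHA):
    """Three vertices are collinear if the triangle they form has
    (near-)zero area; alpha relaxes the exact zero-area test."""
    doubled_area = abs((b[0] - a[0]) * (c[1] - a[1])
                       - (c[0] - a[0]) * (b[1] - a[1]))
    return doubled_area / 2.0 <= alpha
```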
For example:

A1(0, 3)  B1(-1, 3)
A2(0, 3)  B2(1, 2)
A3(-3, 3) B3(-1, 3)
A4(-3, 3) B4(-4, 3)
A5(-4, 3) B5(-5, 2)

Figure 3: An example of five segments with integer coordinates (input file Segments_Example)

2.2. A sequential algorithm

An algorithm to simplify the line segments constructs polylines from coincident and collinear vertices. This can be achieved by eliminating vertices whose distances to the prior initial vertex are less than a maximum accepted tolerance α; vertices farther from the initial vertex (> α) can be considered part of a different polyline. However, finding the coincident and collinear vertices is expensive in this problem. Let segArray be the array of line segments. The sequential algorithm is the following:

Begin
1. Set pointer current to the first segment in segArray (current = 0)
2. While current has not reached the end of segArray
   2.1. Set pointer next to the segment after the current segment (next = current + 1)
   2.2. While next has not reached the end of segArray
        a. Check whether the segment at next has a vertex coincident with the current segment (the four arrangements discussed in Section 2)
        b. If yes, invert the next segment if needed (in the case of the second and fourth arrangements)
        c. Move the next segment immediately after the current segment in the array
        d. Move pointer current to the next segment (current += 1)
        e. Repeat step 2.2
   2.3. Move pointer current to the next segment (current += 1)
   2.4. Repeat step 2
End

The sequential algorithm above rearranges the line segments into a stream order. The mechanism is similar to selection sort: for every investigated line segment, the algorithm must search all remaining line segments to find the segment with a coincident vertex and move it into place. This inefficient searching and sorting is reflected in the two while loops of steps 2 and 2.2. Consequently, the complexity of this algorithm is O(n²), where n is the number of vertices.
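The rearrangement step above can be transcribed into the following simplified sketch. It matches by exact coincidence by default (alpha = 0), and for brevity it only chains forward from the end vertex of the current segment, omitting the arrangements that extend the chain backwards.

```python
import math

def rearrange(segments, alpha=0.0):
    """Selection-sort-style rearrangement of digitized segments into a
    stream order, following the sequential algorithm above.

    segments is a list of (A, B) vertex pairs; alpha is the coincidence
    tolerance."""
    def close(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1]) <= alpha

    segs = list(segments)
    current = 0
    while current < len(segs) - 1:
        _, b = segs[current]  # end vertex of the current segment
        for nxt in range(current + 1, len(segs)):
            c, d = segs[nxt]
            if close(b, c):
                match = (c, d)
            elif close(b, d):
                match = (d, c)  # invert the next segment
            else:
                continue
            # Move the matching segment directly after the current one.
            segs.pop(nxt)
            segs.insert(current + 1, match)
            break
        current += 1
    return segs
```

The two nested loops make the O(n²) cost of the sequential approach explicit.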
3. The MASC Model

The following is a description of the Multiple Associative Computing (MASC) model of parallel computation. As shown in Figure 4, the MASC model consists of an array of processor-memory pairs called cells and an array of instruction streams.

Figure 4: Conceptual view of MASC

A MASC machine with n cells and j instruction streams is denoted MASC(n, j). It is expected that the number of instruction stream processors is much smaller than the number of cells. The model also includes three virtual networks:

1. A cell network used for cell-to-cell communication. This network is used for the parallel movement of data between cells. It could be a linear array, mesh, hypercube, or a dynamic interconnection network.
2. A broadcast/reduction network used for communication between an instruction stream and a set of cells. This network is also capable of performing common reduction operations.
3. An instruction stream network used for inter-instruction-stream communication.

Cells receive their next set of instructions to execute from the instruction stream broadcast network. Cells can be instructed by their current instruction stream to send and receive messages to and from other cells in the same partition, using some communication pattern, via the cell network. Each instruction stream processor is also connected to two interconnection networks: it broadcasts instructions to the cells using the instruction stream broadcast network, and the instruction streams may communicate with each other using the instruction stream network. Any of these networks may be virtual and be simulated by whatever network is present.
MASC provides one or more instruction streams. Each active instruction stream is assigned to a unique dynamic partition of cells. This allows a task that is being executed in a data parallel fashion to be partitioned into two or more data parallel tasks using control parallelism. The multiple instruction streams supported by the MASC model allow for greater efficiency, flexibility, and re-configurability than is possible with only one instruction stream. While SIMD architectures can execute data parallel programs very efficiently and normally obtain near-linear speedup, data parallel programs in many applications are not completely data parallel and contain several non-trivial regions where significant branching occurs [3]. In these regions, only a subset of traditional SIMD processors can be active at the same time. With MASC, control parallelism can be used to execute these different branches simultaneously.

Other MASC properties include:

- The cells of the MASC model consist of a processing element (PE) and local memory. The accumulated memory of the MASC model consists of the array of cells. There is no shared memory between cells.
- Each instruction stream is a processor connected by a bus or broadcast/reduction network to all cells.
- Each cell listens to only one instruction stream; initially, all cells listen to the same instruction stream. Cells can switch to another instruction stream in response to commands from their current instruction stream.
- An active cell executes the commands it receives from its instruction stream, while an inactive cell listens to but does not execute the commands from its instruction stream. Each instruction stream has the ability to unconditionally activate all cells listening to it.
- Cells without further work are called idle cells and are assigned to a specified instruction stream which, among other tasks, manages the idle cells.
- The average time for a cell to send a message through the cell network to another cell is characterized by the parameter t_route.
- Each cell can also read or write a word to an I/O channel. The time for a cell to perform this I/O transfer is characterized by the parameter t_i/o.
- The maximum time for a cell to execute a command is given by the parameter t_local.
- The time to perform a broadcast of either data or instructions is given by the predictability parameter t_bcast.
- The time to perform a reduction operation is given by the predictability parameter t_reduce.
- The time to perform instruction stream synchronization is characterized by the parameter t_synch.

An instruction stream can instruct its active cells to perform an associative search in time t_bcast + t_local + t_reduce. Successful cells are called responders, while unsuccessful cells are called non-responders. The instruction stream can activate either the set of responders or the set of non-responders. It can also restore the previous set of active cells in t_bcast + t_local time. Each instruction stream has the ability to select an arbitrary responder from the set of active cells in t_bcast + t_local time. An active instruction stream can compute the OR, AND, greatest lower bound, or least upper bound of a set of values in all active cells in t_reduce time. An idle cell can be dynamically allocated to an instruction stream in t_synch + t_bcast time.

These predictability parameters were identified using an object-oriented description of the MASC model in [11]. They were developed to identify the performance costs of different architecture classes of parallel computing equipment. When the MASC model is implemented on a traditional SIMD computer, it is highly deterministic, and the predictability costs can often be calculated and are often "best possible" [7]. Many of the predictability parameters for MASC operations become fixed or operate in one step [7].
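The associative search and OR reduction described above can be illustrated with a small sequential simulation. The dictionary-based cell representation and the function names are our own assumptions; on MASC hardware each search completes in t_bcast + t_local + t_reduce time, whereas this simulation simply loops over the cells.

```python
def associative_search(cells, field, value):
    """Sequential simulation of a MASC associative search.

    The instruction stream broadcasts (field, value); every active cell
    compares its local copy and raises a responder flag.  `cells` is a
    list of dicts standing in for PE local memories.  Returns the set of
    responder indices."""
    return {i for i, cell in enumerate(cells)
            if cell.get("active", True) and cell.get(field) == value}

def any_responder(responders):
    """The OR reduction an instruction stream uses to test for responders."""
    return len(responders) > 0
```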
4. A MASC Line Simplification Algorithm

Realizing the inefficiency of the searching and sorting required by the problem discussed in Section 2, we adopt the constant-time global operations of the MASC model to avoid it. Consider the simple example with five segments having integer-coordinate vertices, as shown in Figure 3. In this example, coincident vertices have identical coordinates, and three vertices are collinear if the triangle they form has zero area. Both tests can be relaxed by adding an accepted tolerance α to the functions that check coincidence and collinearity.

The input data are as described in Figure 3. Every line of the input file is a line segment consisting of two vertices, and each vertex has an x-coordinate and a y-coordinate. Therefore, in the initial state we can determine each vertex's left or right neighbor, which is the other endpoint of the same segment. We use a tabular organization, illustrated in Figure 5, as the data structure in our algorithm. That is, each PE stores the left and right neighbors (left$ and right$) of the vertex it holds, along with its coincident vertex (coin$, if any). Vertex A is said to be on the left of vertex B if A's x-coordinate is less than B's, or if A's y-coordinate is less than B's when A and B have the same x value. Vertex A is said to be on the right of vertex B if A's x-coordinate is greater than B's, or if A's y-coordinate is greater than B's when A and B have the same x value. In addition to these location variables, two more variables are defined: visited$, which tracks whether the vertex has been visited, and del$, which indicates whether the vertex should be eliminated. Furthermore, every vertex is assigned to one PE in the MASC model, which requires a massive number of processing elements.

MASC_LINE_SIMPLIFICATION Algorithm
Begin
1. Set all PEs to active
2. Set del$ = 'No', visited$ = 'No'
3. Set left$/right$ to the other vertex of the segment
4. For all PEs, repeat until no visited$ = 'No'
   4.1. Find the coincident vertex
   4.2. If there is no responder (no coincident vertex)
        4.2.1. Set visited$ = 'Yes'
   4.3. Get the vertex from the responder (if any)
   4.4. Set the empty left$/right$ field of each of the two coincident vertices to its coincident vertex
   4.5. Check whether the left$/right$ neighbors of the two coincident vertices and the vertices themselves are collinear
        4.5.1. If they are not collinear:
               a) Set the current PE's del$ = 'Yes'; the responder's del$ stays 'No'
               b) Update each field having the deleted vertex as a neighbor to its coincident vertex (responders)
               c) Set visited$ of both vertices to 'Yes'
               d) Clear coin$ of both vertices
        4.5.2. Else, if they are collinear:
               a) Set del$ of both vertices to 'Yes'
               b) Set the current PE's visited$ = 'Yes'
               c) Update each field that has one of the deleted vertices as a neighbor (responders):
                  i. If the deleted vertex is in left$, update it to the left$ of the deleted vertices
                  ii. Else, if the deleted vertex is in right$, update it to the right$ of the deleted vertices
               d) Clear coin$ of both vertices
End

Using the MASC model, our algorithm does not have to re-arrange the line segments, because it takes advantage of associative searching. The operations "find the coincident vertex" in step 4.1 and "find the vertices that have it as a neighbor" in steps 4.5.1b and 4.5.2c return their values in constant time. After the program finishes (all visited$ are 'Yes'), the vertices whose del$ is 'No' remain.

Figure 5: The initial table

PE:       1   2   3   4   5   6   7   8   9   10
vertex:   A1  B1  A2  B2  A3  B3  A4  B4  A5  B5
left$:    B1  -   -   A2  -   A3  B4  -   B5  -
right$:   -   A1  B2  -   B3  -   -   A4  -   A5
coin$:    -   -   -   -   -   -   -   -   -   -
visited$: No  No  No  No  No  No  No  No  No  No
del$:     No  No  No  No  No  No  No  No  No  No

Figure 6: The table after one iteration

PE:       1   2   3   4   5   6   7   8   9   10
vertex:   A1  B1  A2  B2  A3  B3  A4  B4  A5  B5
left$:    B1  -   A2  A2  -   A3  B4  -   B5  -
right$:   A2  A2  B2  -   B3  -   -   A4  -   A5
coin$:    -   -   -   -   -   -   -   -   -   -
visited$: Yes No  Yes No  No  No  No  No  No  No
del$:     Yes No  No  No  No  No  No  No  No  No
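The initialization of the PE table (steps 1-3 and Figure 5) can be sketched sequentially as follows. The dictionary-based cell layout is our own choice, standing in for the PE local memories.

```python
def build_table(segments):
    """Build the initial PE table of Figure 5: one cell per vertex, with
    the other endpoint of its segment stored as its left or right
    neighbor, following the left/right definition in the text."""
    def is_left(a, b):
        # A is left of B if A.x < B.x, or A.y < B.y when the x values tie.
        return a[0] < b[0] or (a[0] == b[0] and a[1] < b[1])

    table = []
    for a, b in segments:
        for v, other in ((a, b), (b, a)):
            cell = {"vertex": v, "left": None, "right": None,
                    "coin": None, "visited": False, "delete": False}
            if is_left(other, v):
                cell["left"] = other
            else:
                cell["right"] = other
            table.append(cell)
    return table
```

Applied to the segments of Figure 3, this reproduces the left$/right$ columns of the initial table.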
Those remaining vertices belong to the simplified polylines of the level curve's visual representation. The directions of the remaining vertices are maintained by their left$ and right$ neighbors. Figure 5 illustrates the initial table representing the original digitized vertices. During each iteration, a vertex is used in an associative search for its coincident vertex; then their neighbors are checked for collinearity. Appropriate actions are executed to guarantee that after every iteration there is no deleted vertex left in the table, and that all vertices will have been visited when the program finishes. Figure 6 shows the table after one iteration. The associative searching capabilities of the MASC model allow each iteration to take constant time. Figure 7 shows the final state of the table after all vertices are visited, and Figure 8 shows the resultant polyline, constructed from fewer segments and vertices. The running time of the algorithm is O(n) in the worst case, which occurs when there is no coincidence between vertices.

Figure 7: The table after all vertices are visited

PE:       1   2   3   4   5   6   7   8   9   10
vertex:   A1  B1  A2  B2  A3  B3  A4  B4  A5  B5
left$:    B5  B5  A2  A2  B5  B5  B5  A5  B5  -
right$:   A2  A2  B2  -   A2  A2  A2  A2  A2  A5
coin$:    -   -   -   -   -   -   -   -   -   -
visited$: Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
del$:     Yes Yes No  No  Yes Yes Yes Yes No  No

Figure 8: The resultant polyline

5. Conclusion and Future Work

Polyline simplification plays an important role in visualization applications. The reduction in the number of points can help spatial analysis programs improve their performance. The Douglas-Peucker algorithm developed in 1973 is considered an effective solution, with a complexity of O(mn) [2], and its enhanced version [4] from 1992 has a complexity of O(n log n) (m: number of segments, n: number of vertices). In this paper, a massively parallel polyline simplification algorithm has been introduced. This algorithm has a parallel complexity of O(n) using n processing elements.
This speedup is achieved by using a massive number of processing elements and the associative global operations of the Multiple-instruction-stream Associative Computing model (MASC). The algorithm requires only one instruction stream and n processing elements; therefore, its time complexity depends linearly on the number of vertices. The result of our algorithm as compared to other algorithms is summarized in Figure 9.

Algorithm                | Worst-case complexity | Number of processors | Cost
Sequential               | O(n²)                 | 1                    | O(n²)
Douglas-Peucker          | O(mn)                 | 1                    | O(mn)
Enhanced Douglas-Peucker | O(n log n)            | 1                    | O(n log n)
MASC                     | O(n)                  | n                    | O(n²)

Figure 9: Sequential algorithms vs. the MASC algorithm

Although this algorithm requires a massive number of processing elements, its parallel computational cost is the same as the Douglas-Peucker algorithm's when the number of line segments is high. This research provides an effective solution to this problem.

The future work for this research includes an implementation of the MASC polyline simplification algorithm using the Chapel parallel programming language. Chapel is the language of choice because of its support for data parallelism, task parallelism, concurrency, and nested parallelism via high-level abstractions [1], which closely matches the MASC model. Furthermore, this algorithm will be implemented using CUDA, a hardware platform that can also support the MASC model. The future work also includes an analysis of the average-case and best-case complexities. The results can then be compared with the complexity of the enhanced Douglas-Peucker algorithm [4].

References

[1] B. L. Chamberlain, D. Callahan, H. P. Zima, "Parallel Programmability and the Chapel Language", Int'l Journal of High Performance Computing Applications, 21(3), pp. 291-312, August 2007.
[2] D. Douglas, T. Peucker, "Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or Its Caricature", The Canadian Cartographer, Univ. of Toronto Press, 10(2), pp.
112-122, December 1973.
[3] G. Fox, "What Have We Learnt from Using Real Parallel Machines to Solve Real Problems", Proc. of the 3rd Conf. on Hypercube Concurrent Computers and Applications, vol. 2, ACM Press, pp. 897-955, 1988.
[4] J. Hershberger, J. Snoeyink, "Speeding Up the Douglas-Peucker Line-Simplification Algorithm", Proc. of the 5th Symp. on Spatial Data Handling, pp. 134-143, 1992.
[5] G. F. Jenks, "Lines, Computers and Human Frailties", Annals of the Association of American Geographers, pp. 1-10, 1981.
[6] M. Jin, J. Baker, "Two Graph Algorithms on an Associative Computing Model", Proc. of the Int'l Conf. on Parallel and Distributed Processing Techniques and Applications (PDPTA), Las Vegas, June 2007.
[7] M. Jin, J. Baker, K. Batcher, "Timing of Associative Operations on the MASC Model", Proc. of the 15th IPDPS (Workshop in Massively Parallel Processing), CD-ROM, April 2001.
[8] J. Potter, Associative Computing: A Programming Paradigm for Massively Parallel Computers, Plenum Press, New York, NY, 1992.
[9] J. Potter, J. Baker, S. Scott, A. Bansal, C. Leangsuksun, C. Asthagiri, "ASC: An Associative-Computing Paradigm", Computer, 27(11), pp. 19-25, 1994.
[10] K. Reumann, A. P. M. Witkam, "Optimizing Curve Segmentation in Computer Graphics", Proc. of the Int'l Computing Symposium, pp. 467-472, 1974.
[11] M. Scherger, "On Using the UML to Describe the MASC Model of Parallel Computation", Proc. of the Int'l Conf. on PDPTA, pp. 2639-2645, Las Vegas, June 2000.
[12] M. Visvalingam, J. D. Whyatt, "Line Generalization by Repeated Elimination of Points", Cartographic Journal, 30(1), pp. 46-52, 1993.
[13] S. T. Wu, R. G. Marquez, "A Non-self-intersection Douglas-Peucker Algorithm", Proc. of the Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI), pp. 60-66, October 2003.