A Massively Parallel Algorithm for Polyline
Simplification Using an Associative Computing Model
Huy Tran
Department of Computing Sciences
Texas A&M University Corpus Christi
Email: [email protected]
Michael Scherger
Department of Computing Sciences
Texas A&M University Corpus Christi
Email: [email protected]
Abstract - Line simplification is a process of reducing the number of line segments used to represent a polyline. This reduction in the number of line segments and vertices can improve the performance of spatial analysis applications. The classic Douglas-Peucker algorithm, developed in 1973, has a complexity of O(mn), where n denotes the number of vertices and m the number of line segments. Its enhanced version, proposed in 1992, has a complexity of O(n log n). In this paper, we present a parallel line simplification algorithm using the Multiple-instruction-stream Associative Computing (MASC) model. Using one instruction stream of the MASC model, our algorithm has a parallel complexity of O(n) in the worst case using n processing elements.
Keywords: parallel algorithms, associative computing, SIMD algorithms, line simplification, vertex elimination, level curve
1. Introduction
2D planar level curves are polylines along which mathematical functions take on constant values. In applications such as AutoCAD or MATLAB, these planar level curves are represented as collections of line segments. An example of a level curve in AutoCAD is shown in Figure 1. The number of digitized line segments collected is far more than necessary [2]. Due to the complexity of the geospatial functions, the number of line segments representing a planar level curve can be very large, which may cause inefficiencies in visual performance. Therefore, the polyline needs to be represented with fewer segments and vertices, which makes it necessary to apply a polyline simplification algorithm to the 2D planar level curve.
If the line segments were acquired in a stream order, where the end vertex of one line segment is the beginning vertex of the next line segment in the file, it would be straightforward to apply a polyline simplification algorithm. However, in this problem, the line segments of the polylines are digitized in a raster scan order. The raster scan ordering of the line segments requires intensive searching on the remaining set of line segments to reconstruct the polyline (O(n²) searches).
Figure 1: An example of a level curve.
The Douglas-Peucker line simplification algorithm is considered an effective line simplification algorithm [2, 13]. The algorithm uses the closeness of a vertex to a segment as a rejection condition. Its worst-case complexity is O(mn), where n denotes the number of vertices and m the number of segments. In 1992, Hershberger and Snoeyink introduced an improvement to the Douglas-Peucker algorithm that achieves an enhanced O(n log n) time complexity [4]. The speedup is achieved by applying binary searches to maintain the path hulls of subchains. Different approaches to this problem have also been discussed in [5, 10, 12]. However, even the worst-case complexities O(mn) and O(n log n) are considered computationally expensive when working with very large visualizations.
In this paper, we present a polyline simplification algorithm using the Multiple-instruction-stream Associative Computing (MASC) model [6, 8] to reduce the number of vertices required to represent polylines. MASC is an enhanced SIMD model with associative properties. By using the constant-time global operations of the MASC model, our algorithm has a parallel complexity of linear time, O(n), in the worst case.
This paper is organized as follows. Section 2 discusses polyline simplification in more detail and provides the sequential algorithm our parallel algorithm is based upon. Section 3 introduces the MASC model of computation and its properties. Section 4 describes our massively parallel line simplification algorithm using the MASC model. Finally, Section 5 discusses future work and concludes the paper.
2. Polyline Simplification and a Sequential
Algorithm
Consider the line segments shown in Figure 2. Two connected segments, for example AB and CD, can have one of the four following arrangements:
• A is coincident or near-coincident to C.
• A is coincident or near-coincident to D.
• B is coincident or near-coincident to C.
• B is coincident or near-coincident to D.
The coincidence or collinearity of vertices is
determined by investigating their coordinates (x, y).
Two near-coincident vertices are considered
coincident if their distance is less than or equal to an
accepted tolerance α.
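The tolerance test described above can be sketched as follows (a minimal illustration; the function name and the use of Euclidean distance are our assumptions, not specified in the paper):

```python
import math

def coincident(p, q, alpha):
    # Two vertices are treated as coincident when their Euclidean
    # distance is at most the accepted tolerance alpha.
    return math.dist(p, q) <= alpha
```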
Figure 2: An example of how two connected segments are not in a stream order (in the actual representation, point B ≈ D within the accepted tolerance α, but in the digitized order A B … C D, D is not next to B in the record file)
2.1. A specific problem
In order to perform the polyline simplification, the digitized line segments in the file need to be re-arranged. The random nature of the digitized line segments necessitates a massive number of search operations to determine coincident points. This is illustrated by the following sequential algorithm, which re-arranges these line segments into a stream where line segments having coincident vertices are moved close to each other. For example, as shown in Figure 3, five line segments have been digitized. After rearrangement, the stream order would be: B2 A2 A1 B1 B3 A3 A4 B4 A5 B5. Then, each vertex is checked against its next vertex for coincidence and eliminated accordingly. In this example, points A1, B1, A3, and B4 are eliminated due to coincidence, and points A4 and B3 are deleted due to collinearity.
A1(0, 3) B1(-1, 3)
A2(0, 3) B2(1, 2)
A3(-3, 3) B3(-1, 3)
A4(-3, 3) B4(-4, 3)
A5(-4, 3) B5(-5, 2)
Figure 3: An example of five segments with integer coordinates
2.2. A sequential algorithm
An algorithm to simplify line segments is
constructing polylines from coincident and collinear
vertices. This can be obtained by eliminating vertices
whose distances to the prior initial vertex are less
than a maximum accepted tolerance α. The vertices
having further distance to the initial vertex (> α)
could be considered as part of a different polyline.
However, finding the coincident and collinear
vertices is expensive in this problem.
We call segArray the array of line segments.
The sequential algorithm is the following:
Begin
1. Set pointer current to the first segment in segArray (current = 0)
2. While current has not reached the end of segArray
2.1. Set pointer next to the segment after current (next = current + 1)
2.2. While next has not reached the end of segArray
a. Check whether the segment at next has a vertex coincident with the current segment (the four arrangements discussed in Section 2)
b. If yes, invert the next segment if needed (in the case of the second and fourth arrangements)
c. Move the next segment close to the current segment in the array
d. Move pointer current to the next segment (current += 1)
e. Repeat step 2.2
2.3. Move pointer current to the segment after current (current += 1)
2.4. Repeat step 2
End
The sequential algorithm above re-arranges the line segments into a stream order. The mechanism is similar to selection sort. For every investigated line segment, the algorithm searches all remaining line segments for the one having a coincident vertex and moves it to the right place. This inefficient searching and sorting is reflected in the two while loops of steps 2 and 2.2. Consequently, the complexity of this algorithm is O(n²), where n is the number of vertices.
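The re-arrangement pass can be sketched in Python as follows (a simplified sketch: for brevity it chains only on the current segment's end vertex, i.e. the B≈C and B≈D arrangements; all names are ours):

```python
import math

def coincident(p, q, alpha=0.0):
    # vertices within tolerance alpha are treated as the same point
    return math.dist(p, q) <= alpha

def rearrange(segments, alpha=0.0):
    # Selection-sort-like pass: for each segment, scan the rest of the
    # array (the inner while loop of step 2.2) for a segment sharing a
    # coincident vertex, invert it when its end vertex is the match,
    # and move it next to the current segment -- O(n^2) overall.
    segs = [list(s) for s in segments]
    current = 0
    while current < len(segs) - 1:
        _, b = segs[current]                     # current segment's end vertex
        for nxt in range(current + 1, len(segs)):
            c, d = segs[nxt]
            if coincident(b, c, alpha):          # B ~ C: already oriented
                pass
            elif coincident(b, d, alpha):        # B ~ D: invert the found segment
                segs[nxt] = [d, c]
            else:
                continue
            segs.insert(current + 1, segs.pop(nxt))
            break
        current += 1
    return segs
```

For the integer-coordinate data of Figure 3, the tolerance can stay at zero.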
3. The MASC model
The following is a description of the Multiple
Associative Computing (MASC) model of parallel
computation. As shown in Figure 4, the MASC model
consists of an array of processor-memory pairs called
cells and an array of instruction streams.
Figure 4: Conceptual view of MASC
A MASC machine with n cells and j
instruction streams is denoted as MASC(n, j). It is
expected that the number of instruction stream
processors be much less than the number of cells.
The model also includes three virtual networks:
1. A cell network used for cell-to-cell communication. This network is used for the parallel movement of data between cells and could be a linear array, mesh, hypercube, or a dynamic interconnection network.
2. A broadcast/reduction network used for communication between an instruction stream and a set of cells. This network is also capable of performing common reduction operations.
3. An instruction stream network used for inter-instruction-stream communication.
Cells can receive their next set of
instructions to execute from the instruction stream
broadcast network. Cells can be instructed from their
current instruction stream to send and receive
messages to other cells in the same partition using
some communication pattern via the cell network.
Each instruction stream processor is also connected
to two interconnection networks. An instruction
stream processor broadcasts instructions to the cells
using the instruction stream broadcast network. The
instruction streams also may need to communicate
and may do so using the instruction stream network.
Any of these networks may be virtual and be
simulated by whatever network is present.
MASC provides one or more instruction
streams. Each active instruction stream is assigned to
a unique dynamic partition of cells. This allows a
task that is being executed in a data parallel fashion
to be partitioned into two or more data parallel tasks
using control parallelism.
The multiple instruction streams supported by the MASC model allow for greater efficiency, flexibility, and re-configurability than is possible with only one instruction stream. While
SIMD architectures can execute data parallel
programs very efficiently and normally can obtain
near linear speedup, data parallel programs in many
applications are not completely data parallel and
contain several non-trivial regions where significant
branching occurs [3]. In these parallel programming
regions, only a subset of traditional SIMD processors
can be active at the same time. With MASC, control
parallelism can be used to execute these different
branches simultaneously. Other MASC properties
include:
• The cells of the MASC model consist of a processing element (PE) and local memory. The accumulated memory of the MASC model consists of an array of cells. There is no shared memory between cells.
• Each instruction stream is a processor with a bus or broadcast/reduction network to all cells. Each cell listens to only one instruction stream, and initially all cells listen to the same instruction stream. The cells can switch to another instruction stream in response to commands from the current instruction stream.
• An active cell executes the commands it receives from its instruction stream, while an inactive cell listens to but does not execute the commands from its instruction stream. Each instruction stream has the ability to unconditionally activate all cells listening to it.
• Cells without further work are called idle cells and are assigned to a specified instruction stream, which among other tasks manages the idle cells.
• The average time for a cell to send a message through the cell network to another cell is characterized by the parameter t_route. Each cell can also read or write a word to an I/O channel. The maximum time for a cell to execute a command is given by the parameter t_local. The time to perform a broadcast of either data or instructions is given by the predictability parameter t_bcast. The time to perform a reduction operation is given by the predictability parameter t_reduce. The time for a cell to perform an I/O transfer is characterized by the parameter t_io. The time to perform instruction stream synchronization is characterized by the parameter t_sync.
• An instruction stream can instruct its active cells to perform an associative search in time t_bcast + t_local + t_reduce. Successful cells are called responders, while unsuccessful cells are called non-responders.
• The instruction stream can activate either the set of responders or the set of non-responders. It can also restore the previous set of active cells in t_bcast + t_local time.
• Each instruction stream can select an arbitrary responder from the set of active cells in t_bcast + t_local time.
• An active instruction stream can compute the OR, AND, greatest lower bound, or least upper bound of a set of values in all active cells in t_reduce time.
• An idle cell can be dynamically allocated to an instruction stream in t_sync + t_bcast time.
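These global operations can be modeled sequentially (an illustrative toy model, not MASC hardware; each function below runs in O(n) here but costs a constant t_bcast + t_local + t_reduce on the model):

```python
def associative_search(cells, predicate, active=None):
    # Broadcast a predicate to every active cell; each cell evaluates it
    # locally, and the indices whose result is true are the responders.
    active = range(len(cells)) if active is None else active
    return [i for i in active if predicate(cells[i])]

def reduce_or(cells, predicate, active=None):
    # Global OR reduction: "does any active cell respond?"
    return bool(associative_search(cells, predicate, active))

def pick_one(cells, predicate, active=None):
    # Select an arbitrary responder, as an instruction stream may do.
    responders = associative_search(cells, predicate, active)
    return responders[0] if responders else None
```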
These predictability parameters were identified using an object-oriented description of the MASC model in [11]. They were developed to identify the performance costs of different architecture classes of parallel computing equipment.
When the MASC model is implemented using a
traditional SIMD computer, it is highly deterministic
and the predictability costs can often be calculated
and are often “best possible” [7]. Many of the
predictability parameters for MASC operations
become fixed or operate in one step [7].
4. A MASC Line Simplification Algorithm
Realizing the inefficiency of searching and
sorting when coping with the problem as discussed
in section 2, we adopt global constant time
operations of the MASC model to avoid such
inefficiency.
Consider the simple example with five segments having integer-coordinate vertices shown in Figure 3. In this example, coincident vertices have the same coordinates, and three vertices are called collinear if the triangle formed by them has zero area. Both checks can be relaxed by adding an accepted tolerance α to the functions that test coincidence and collinearity.
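Both predicates can be sketched as follows (the function names are ours; the area test uses the standard cross-product formula for twice the signed triangle area):

```python
def coincident(p, q, alpha=0.0):
    # same coordinates, up to the accepted tolerance alpha
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= alpha ** 2

def collinear(p, q, r, alpha=0.0):
    # three vertices are collinear when the triangle they form has
    # (near-)zero area; the cross product below is twice the signed area
    area2 = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(area2) <= alpha
```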
The input data are described in Figure 3 as well. Every line of the input file is a line segment consisting of two vertices, and each vertex has an x-coordinate and a y-coordinate. Therefore, in the initial state we can determine each vertex's left or right neighbor, which is the other point in the same segment.
We use a tabular organization similar to the one illustrated in Figure 5 as the data structure in our algorithm. That is, each PE stores the left and right neighbors (left$ and right$) of the currently investigated vertex and its coincident vertex (coin$, if any). Vertex A is said to be on the left of vertex B if A's x-coordinate is less than B's, or if A's y-coordinate is less than B's when A and B have the same x value. Vertex A is said to be on the right of vertex B if A's x-coordinate is greater than B's, or if A's y-coordinate is greater than B's when A and B have the same x value. In addition to these location variables, two more variables are defined: visited$, for tracking whether the vertex has been visited, and del$, for indicating whether the vertex should be eliminated. Furthermore, every vertex is assigned to one PE in the MASC model, which results in a massive number of processing elements.
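The tabular organization can be sketched as a per-PE record (the Python field names mirror the paper's left$/right$/coin$/visited$/del$ notation; this sequential structure is our illustration, not the MASC memory layout):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Cell:
    # one MASC cell: a PE holding one vertex and its table fields
    name: str
    xy: Tuple[float, float]
    left: Optional[str] = None    # left$
    right: Optional[str] = None   # right$
    coin: Optional[str] = None    # coin$
    visited: bool = False         # visited$
    delete: bool = False          # del$

def initial_table(segments):
    # Steps 1-3 of the algorithm: each vertex's left$ or right$ is set
    # to the other vertex of its segment; comparing (x, y) tuples
    # lexicographically matches the paper's left/right rule.
    cells = {}
    for (na, pa), (nb, pb) in segments:
        cells[na], cells[nb] = Cell(na, pa), Cell(nb, pb)
    for (na, pa), (nb, pb) in segments:
        if pa < pb:
            cells[na].right, cells[nb].left = nb, na
        else:
            cells[na].left, cells[nb].right = nb, na
    return cells
```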
MASC_LINE_SIMPLIFICATION Algorithm
Begin
1. Set all PEs to active
2. Set del$ = 'No', visited$ = 'No'
3. Set left$/right$ to the other vertex of the segment
4. For all PEs, repeat until no visited$ is 'No'
4.1. Find the coincident vertex
4.2. If there is no responder (no coincident vertex)
4.2.1. Set visited$ = 'Yes'
4.3. Get the vertex from the responder (if any)
4.4. Set the empty left$/right$ of the two coincident vertices to the coincident vertex
4.5. Check whether the left$/right$ neighbors of the two coincident vertices and the vertices themselves are collinear
4.5.1. If not:
a) Set the current PE's del$ = 'Yes' and the responder's del$ = 'No'
b) Update fields having the deleted vertex as neighbor to its coincident vertex (responders)
c) Set visited$ of both vertices = 'Yes'
d) Clear coin$ of both vertices
4.5.2. Else, if they are collinear:
a) Set both vertices' del$ to 'Yes'
b) Set the current PE's visited$ = 'Yes'
c) Update fields that have the deleted vertices (responders) as neighbor:
i. If the deleted vertex is in left$, update to left$ of the deleted vertices
ii. Else, if the deleted vertex is in right$, update to right$ of the deleted vertices
d) Clear coin$ of both vertices
End

The running time of the algorithm is O(n) in the worst case, which occurs when there is no coincidence between vertices.
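Step 4.1 can be modeled as an associative search over the table (a sequential sketch; on MASC the entire search is a single t_bcast + t_local + t_reduce step, independent of the number of cells):

```python
def find_coincident(cells, i, alpha=0.0):
    # The instruction stream broadcasts cell i's coordinates; every
    # other live cell compares them with its own, and the responders
    # are the coincident vertices. One arbitrary responder is returned.
    x, y = cells[i]["xy"]
    responders = [j for j, c in enumerate(cells)
                  if j != i and not c["del"]
                  and (c["xy"][0] - x) ** 2 + (c["xy"][1] - y) ** 2 <= alpha ** 2]
    return responders[0] if responders else None
```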
Figure 5: The initial table (one vertex per PE; left$ or right$ holds the other vertex of the segment; coin$ is empty; visited$ and del$ are 'No' for every vertex)
Figure 6: The table after one iteration (A1 has been found coincident with A2 and deleted; A1 and A2 are marked visited)
Using the MASC model, our algorithm does not have to re-arrange the line segments, because it takes advantage of associative searching. The operations "find its coincident vertex" in step 4.1 and "find vertices that have it as a neighbor" in steps 4.5.1b and 4.5.2c return values in constant time. After the program finishes (all visited$ are 'Yes'), the vertices whose del$ is 'No' remain. These remaining vertices belong to the simplified polylines of the level curve's visual representation. The directions of the remaining vertices are maintained by their left$ and right$ neighbors.
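Recovering a simplified polyline from the final table is then a walk along the surviving neighbor links (a sketch over a plain dictionary; the field names mirror the table's columns):

```python
def extract_polyline(table, start):
    # Follow right$ pointers from a surviving (del$ = 'No') vertex,
    # collecting vertex names in order; `seen` guards against cycles.
    chain, name, seen = [], start, set()
    while name is not None and name not in seen:
        seen.add(name)
        if not table[name]["del"]:
            chain.append(name)
        name = table[name].get("right")
    return chain
```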
Figure 5 illustrates the initial table representing the original digitized vertices. During each iteration, a vertex is used in an associative search for its coincident vertex, and then their neighbors are checked for collinearity. Appropriate actions guarantee that after every iteration no deleted vertex remains referenced in the table, and that all vertices will have been visited when the program finishes. Figure 6 shows the table after one iteration. The associative searching capabilities of the MASC model allow each iteration to take constant time. Figure 7 shows the final state of the table after all vertices are visited, and Figure 8 shows the resultant polyline, constructed from fewer segments and vertices.
Figure 7: The table after all nodes are visited (all visited$ are 'Yes'; only vertices A2, B2, A5, and B5 remain with del$ = 'No')
Figure 8: The resultant polyline
5. Conclusion and Future Work
Polyline simplification plays an important
factor in visualization applications. The reduction in
number of points can help spatial analysis programs
improve their performances. The Douglas-Peucker
algorithm developed in 1973 is considered an
effective solution with the complexity of O(mn) [2],
and its enhanced version [4] in 1992 has the
complexity of O(nlogn) (m – number of segments, n
– number of vertices).
In this paper, a massively parallel polyline simplification algorithm has been introduced. This algorithm has a parallel complexity of O(n) using n processing elements. This speedup is achieved by using a massive number of processing elements and the associative global operations of the Multiple-instruction-stream Associative Computing (MASC) model. The algorithm requires only one instruction stream and n processing elements; therefore, its time complexity depends linearly on the number of vertices. The result of our algorithm compared to other algorithms is summarized in Figure 9.
Algorithm                    Worst-case complexity   Number of processors   Cost
Sequential                   O(n²)                   1                      O(n²)
Douglas-Peucker              O(mn)                   1                      O(mn)
Enhanced Douglas-Peucker     O(n log n)              1                      O(n log n)
MASC                         O(n)                    n                      O(n²)
Figure 9: Sequential algorithms vs. the MASC algorithm
Although this algorithm requires a massive number of processing elements, its parallel computational cost is the same as that of the Douglas-Peucker algorithm when the number of line segments is large (m approaching n). This research provides an effective solution to this problem.
The future work for this research includes an implementation of the MASC polyline simplification algorithm using the Chapel parallel programming language. Chapel is the language of choice because its features for data parallelism, task parallelism, concurrency, and nested parallelism via high-level abstractions [1] closely match the MASC model. Furthermore, this algorithm will be implemented using CUDA, a hardware platform that can also support the MASC model.
The future work of this research also
includes an analysis of the average case and best case
complexities. The results then can be compared with
the enhanced Douglas-Peucker algorithm complexity
[4].
References
[1] B. L. Chamberlain, D. Callahan, H. P. Zima, "Parallel Programmability and the Chapel Language", Int'l Journal of High Performance Computing Applications, 21(3), pp. 291-312, August 2007.
[2] D. Douglas, T. Peucker, "Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or Its Caricature", Canadian Cartographer, Univ. of Toronto Press, 10(2), pp. 112-122, December 1973.
[3] G. Fox, "What Have We Learnt from Using Real Parallel Machines to Solve Real Problems", Proc. of the 3rd Conf. on Hypercube Concurrent Computers and Applications, vol. 2, ACM Press, pp. 897-955, 1988.
[4] J. Hershberger, J. Snoeyink, "Speeding Up the Douglas-Peucker Line-Simplification Algorithm", Proc. of the 5th Int'l Symp. on Spatial Data Handling, pp. 134-143, 1992.
[5] G. F. Jenks, "Lines, Computers and Human Frailties", Annals of the Association of American Geographers, pp. 1-10, 1981.
[6] M. Jin, J. Baker, "Two Graph Algorithms on an Associative Computing Model", Proc. of the Int'l Conf. on Parallel and Distributed Processing Techniques and Applications (PDPTA), Las Vegas, June 2007.
[7] M. Jin, J. Baker, and K. Batcher, "Timing of Associative Operations on the MASC Model", Proc. of the 15th IPDPS (Workshop in Massively Parallel Processing), CD-ROM, April 2001.
[8] J. Potter, "Associative Computing: A Programming Paradigm for Massively Parallel Computers", Plenum Press, New York, NY, 1992.
[9] J. Potter, J. Baker, S. Scott, A. Bansal, C. Leangsuksun, and C. Asthagiri, "ASC: An Associative-Computing Paradigm", Computer, 27(11), pp. 19-25, 1994.
[10] K. Reumann, A. P. M. Witkam, "Optimizing Curve Segmentation in Computer Graphics", Proc. of the Int'l Computing Symposium, pp. 467-472, 1974.
[11] M. Scherger, "On Using the UML to Describe the MASC Model of Parallel Computation", Proc. of the Int'l Conf. on PDPTA, pp. 2639-2645, Las Vegas, June 2000.
[12] M. Visvalingam, J. D. Whyatt, "Line Generalization by Repeated Elimination of Points", Cartographic Journal, 30(1), pp. 46-52, 1993.
[13] S. T. Wu, R. G. Marquez, "A Non-self-intersection Douglas-Peucker Algorithm", Computer Graphics and Image Processing, pp. 60-66, October 2003.