Tuesday, November 14, 2006
“UNIX was never designed to keep
people from doing stupid things,
because that policy would also keep
them from doing clever things.”
- Doug Gwyn
1
Sorting Algorithms
Sorting fundamental part of computer
operations.
Most common based method of sorting is
comparison-based.
2
The basic operation of comparison-based
sorting is compare-exchange.
The lower bound on any sequential
comparison-based sort of n numbers is ?
3
The lower bound on any sequential
comparison-based sort of n numbers is
Θ(nlog n).
4
Sorting: Basics
What is a parallel sorted sequence? Where are
the input and output lists stored?
5
Sorting: Basics
Assumption that the input and output lists are
distributed.
Sorting can be intermediate step in another
algorithm
6
Sorting: Basics
The sorted output list is partitioned with the
property that each partitioned list is sorted and
each element in processor Pi's list is less than
that in Pj's list if i < j.
7
Sorting:
Parallel Compare Exchange Operation
One element per process.
ts+tw
Overall runtime dominated by inter-process communication
8
Sorting: Parallel Compare Split Operation
More than one element per process.
The time for a compare-split operation is ? (assuming that
the two partial lists were initially sorted).
9
Sorting: Parallel Compare Split Operation
More than one element per process.
The time for a compare-split operation is (ts+ twn/p)
(assuming that the two partial lists were initially sorted).
For larger block sizes time to merge two blocks is O(n/p)
10
Sorting Networks
Networks of comparators designed specifically for sorting.
11
Sorting Networks
We denote an increasing comparator by
and a decreasing comparator by Ө.
12
The speed of the network is proportional to ?
13
The speed of the network is proportional to its depth
14
Sorting Networks: Bitonic Sort
A bitonic sequence has two tones
Increasing - decreasing tone
Any cyclic rotation of such networks is also
considered bitonic. (i.e. a bitonic sequence that
becomes increasing-decreasing after shifting
its elements)
Is 1,2,4,7,6,0 a bitonic sequence?
15
Sorting Networks: Bitonic Sort
Is 8,9,2,1,0,4 a bitonic sequence?
16
Sorting Networks: Bitonic Sort
Is 8,9,2,1,0,4 a bitonic sequence?
Yes, it is a cyclic shift of 0,4,8,9,2,1.
17
Sorting Networks: Bitonic Sort
Let s = a0,a1,…,an-1 be a bitonic sequence
such that a0 ≤ a1 ≤ ··· ≤ an/2-1 and
an/2 ≥ an/2+1 ≥ ··· ≥ an-1.
Consider the following subsequences of s:
s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-1}
s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-1}
18
Sorting Networks: Bitonic Sort
s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-1}
s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-1}
In s1 there is an element bi such that all elements before
bi are from increasing part and all elements after bi are
from the decreasing part.
In s2 there is an element b’i such that all elements before
b’i are from decreasing part and all elements after b’i are
from the increasing part.
19
Sorting Networks: Bitonic Sort
s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-1}
s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-1}
s1 and s2 are both bitonic and each
element of s1 is less than every element in
s2.
20
Sorting Networks: Bitonic Sort
s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-1}
s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-1}
s1 and s2 are both bitonic and each element of s1
is less than every element in s2.
Divided a bigger problem into two smaller
problems (bitonic split).
We can apply the procedure recursively on s1
and s2 to get the sorted sequence.
21
Sorting Networks: Bitonic Sort
Bitonic merge:
The procedure of sorting a bitonic
sequence using bitonic splits.
22
Sorting Networks: Bitonic Sort
A bitonic merging network for n = 16. Note: input is a bitonic sequence.
A BM[16] bitonic merging network: The network takes a bitonic
23
sequence and outputs it in sorted order.
Number of splits required are?
24
Number of splits required are log(n).
There are log(n) columns in a bitonic
merge network.
25
How do we sort n unordered elements?
26
How do we sort n unordered elements
using bitonic merge?
We must first build a single bitonic
sequence from the given sequence.
27
A sequence of length 2 is a
bitonic sequence.
28
A bitonic sequence of length 4 can be
built by sorting the first two elements
using BM[2] and next two, using ӨBM[2].
This process can be repeated to generate
larger bitonic sequences.
29
Sorting Networks: Bitonic Sort
The last merging network (BM[16]) sorts the input. In
this example, n = 16.
30
Sorting Networks: Bitonic Sort
The comparator network that transforms an input sequence of 16
31
unordered numbers into a bitonic sequence.
Sorting Networks: Bitonic Sort
A bitonic merging network for n = 16. Note: input is a bitonic sequence.
A BM[16] bitonic merging network: The network takes a bitonic
32
sequence and outputs it in sorted order.
d(n) = d(n/2) +log(n)
logn
i (log
2
n log n) / 2
i 1
33
The depth of the network is Θ(log2 n).
A serial implementation of the network
would have complexity Θ(nlog2 n).
34
Bitonic algorithm is communication
intensive.
Take into account topology of underlying
network.
35
Bitonic sorting network for n elements:
log n stages
stage i consists of i columns of n/2
comparators.
On a parallel computer the compare
exchange operations is performed by a
pair of processes.
36
Sorting Networks: Bitonic Sort
A bitonic merging network for n = 16. Note: input is a bitonic sequence.
A BM[16] bitonic merging network: The network takes a bitonic
37
sequence and outputs it in sorted order.
One element per processor.
How to map processes?
Compare-exchange should ideally be between
neighboring processes.
38
Mapping Bitonic Sort to Hypercubes
The compare-exchange operation is
performed between two wires only if
their labels differ in exactly one bit!
This implies a direct mapping of wires to
processors. All communication is nearest
neighbor!
39
40
41
42
43
Mapping Bitonic Sort to Hypercubes
Communication characteristics of bitonic sort on a hypercube.
During each stage of the algorithm, processes communicate
44
along the dimensions shown.
Mapping Bitonic Sort to Hypercubes
Parallel formulation of bitonic sort on a hypercube with n = 2d
processes.
What is the parallel runtime?
45
Mapping Bitonic Sort to Hypercubes
Parallel formulation of bitonic sort on a hypercube with n = 2d
processes.
Parallel runtime: Tp=O(log2n)
46
Cost optimal?
Block of Elements Per Processor
Each process is assigned a block of n/p
elements.
47
Block of Elements Per Processor
The first step is a local sort of the local
block.
48
Block of Elements Per Processor: Hypercube
Initially the processes sort their n/p elements
(using merge sort) in time Θ((n/p)log(n/p)) and
then perform Θ(log2p) compare-split steps.
The parallel run time of this formulation is
Comparing to an optimal sort, the algorithm can
efficiently use up to p (2 logn ) processes.
49
© Copyright 2026 Paperzz