
Merge Sort
Roberto Hibbler
Dept. of Computer Science
Florida Institute of Technology
Melbourne, FL 32901
[email protected]
ABSTRACT
Given an array of elements, we want to arrange those elements
into a sorted order. To sort those elements, we will need to make
comparisons between the individual elements efficiently. Merge
Sort uses a divide and conquer strategy to sort an array efficiently
while making the least number of comparisons between array
elements. Our results show that for arrays with large numbers of
array elements, Merge Sort is more efficient than three other
comparison sort algorithms, Bubble Sort[1], Insertion Sort[3], and
Selection Sort[2]. Our theoretical evaluation shows that Merge
Sort beats a quadratic time complexity, while our empirical
evaluation shows that, across ten different data sets, Merge Sort is
on average 32 times faster than Insertion Sort[3], the currently
recognized most efficient comparison sort algorithm.
Keywords
Merge Sort, sorting, comparisons, Selection Sort[2], arrange
1. INTRODUCTION
The ability to arrange an array of elements into a defined order is
very important in Computer Science. Sorting is heavily used by
online stores, where the order in which services or items were
purchased determines which orders can be filled and who receives
their order first. Sorting is also essential for the database
management systems used by banks and financial institutions, such
as the New York Stock Exchange, to track and rank the billions of
transactions that occur in one day. There are many algorithms that
sort arrays, including Bubble Sort[1], Insertion Sort[3], and
Selection Sort[2]. While these algorithms are programmatically
correct, they are not efficient for arrays with a large number of
elements and exhibit quadratic time complexity.
We are given an array of comparable values. We need to arrange
these values into either an ascending or descending order.
We introduce the Merge Sort algorithm. The Merge Sort
algorithm is a divide-and-conquer algorithm. It takes an array as
input and divides that array into sub-arrays of single elements. A
single element is already sorted, so the sub-arrays are merged back
into sorted arrays two at a time, until we are left with a final
sorted array. We contribute the following:
1. We introduce the Merge Sort algorithm.
2. We show that theoretically Merge Sort has a worst-case
time complexity better than O(n²).
3. We show that empirically Merge Sort is faster than
Insertion Sort[3] over ten data sets.
This paper will discuss in Section 2 comparison sort algorithms
related to the problem, followed by the detailed approach of our
solution in Section 3, the evaluation of our results in Section 4,
and our final conclusion in Section 5.
2. RELATED WORK
The three algorithms that we will discuss are Bubble Sort[1],
Selection Sort[2], and Insertion Sort[3]. All three are comparison
sort algorithms, as is Merge Sort.
The Bubble Sort[1] algorithm works by continually swapping
adjacent array elements if they are out of order until the array is in
sorted order. Every iteration through the array places at least one
element at its correct position. Although algorithmically correct,
Bubble Sort[1] is inefficient for arrays with a large number of
elements and has an O(n²) time complexity. Knuth also observed
that while Bubble Sort[1] shares its worst-case time complexity
with other prevalent sorting algorithms, it makes far more element
swaps than they do, resulting in poor interaction with modern CPU
hardware. We intend to show that Merge Sort needs on average
fewer element swaps than Bubble Sort[1].
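The swapping behavior described above can be sketched in Java, the language used for the evaluation in Section 4; this is a generic illustration of Bubble Sort[1], not the paper's own evaluation code, and the class name is ours:

```java
import java.util.Arrays;

public class BubbleSortSketch {
    // Repeatedly swap adjacent out-of-order elements; each pass
    // bubbles the largest remaining element to its final position.
    public static int[] bubbleSort(int[] a) {
        int[] arr = Arrays.copyOf(a, a.length);
        for (int pass = 0; pass < arr.length - 1; pass++) {
            boolean swapped = false;
            for (int i = 0; i < arr.length - 1 - pass; i++) {
                if (arr[i] > arr[i + 1]) {
                    int tmp = arr[i];
                    arr[i] = arr[i + 1];
                    arr[i + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) break; // no swaps means the array is sorted
        }
        return arr;
    }
}
```

Note how each pass performs up to one swap per comparison, which is the swap-heavy behavior Knuth criticized.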
The Selection Sort[2] algorithm arranges array elements in order
by first finding the minimum value in the array and swapping it
with the element that occupies its correct position, depending on
how the array is being arranged. The process is then repeated with
the second smallest value, and so on, until the array is sorted. This
creates two distinct regions within the array: the part that is sorted
and the part that has not been sorted. Selection Sort[2] improves
over Bubble Sort[1] by not comparing all the elements in its
unsorted part until it is time for an element to be placed into its
sorted position. This makes Selection Sort[2] less affected by the
input's order. However, it is still inefficient for arrays with a large
number of elements, and even with these improvements Selection
Sort[2] still shares the same worst-case time complexity of O(n²).
We intend to show that Merge Sort will operate at a worst-case
time complexity faster than O(n²).
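The select-minimum-and-swap process can be sketched as follows; again an illustrative version (class name ours), not the evaluation code:

```java
import java.util.Arrays;

public class SelectionSortSketch {
    // Repeatedly select the minimum of the unsorted region and swap
    // it into place, growing the sorted region from the left.
    public static int[] selectionSort(int[] a) {
        int[] arr = Arrays.copyOf(a, a.length);
        for (int i = 0; i < arr.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[j] < arr[min]) min = j;   // scan unsorted region
            }
            int tmp = arr[i];                      // one swap per pass
            arr[i] = arr[min];
            arr[min] = tmp;
        }
        return arr;
    }
}
```

Unlike Bubble Sort[1], each outer pass performs at most one swap, though every pass still scans the entire unsorted region.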
The Insertion Sort[3] algorithm takes elements from the input
array and places those elements in their correct place in a new
array, shifting existing array elements as needed. Insertion Sort[3]
improves over Selection Sort[2] by only making as many
comparisons as it needs to determine the correct position of the
current element, while Selection Sort[2] makes comparisons
against each element in the unsorted part of the array. In the
average case, Insertion Sort[3]'s time complexity is O(n²/4), but its
worst case is O(n²), the same as Bubble Sort[1] and Selection
Sort[2]. The tradeoff of Insertion Sort[3] is that on average more
elements are swapped, as array elements are shifted within the
array with the addition of each new element. We intend to show
that Merge Sort operates at an average-case time complexity faster
than O(n²).
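The comparison-and-shift behavior described above can be sketched as follows; this is the common in-place variant (class name ours), shown for illustration rather than as the paper's evaluation code:

```java
import java.util.Arrays;

public class InsertionSortSketch {
    // Shift larger sorted elements one slot to the right until the
    // correct position for the current element is found, then insert.
    public static int[] insertionSort(int[] a) {
        int[] arr = Arrays.copyOf(a, a.length);
        for (int i = 1; i < arr.length; i++) {
            int key = arr[i];
            int j = i - 1;
            while (j >= 0 && arr[j] > key) { // stop at first element <= key
                arr[j + 1] = arr[j];         // each shift is the "swap" cost
                j--;
            }
            arr[j + 1] = key;
        }
        return arr;
    }
}
```

The inner `while` loop stops as soon as the insertion point is found, which is exactly why a nearly sorted input yields the O(n) best case discussed in Section 4.1.2.2.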
Example input used in Figures 1 and 2 – A: (38 27 43 3 9 82 10 1);
output – array A in ascending order.
3. APPROACH
A large array with an arbitrary order needs to be arranged in an
ascending or descending order, either lexicographically or
numerically. Merge sort can solve this problem by using two key
ideas.
The first key idea of merge sort is that a problem can be divided
and conquered. The problem can be broken into smaller arrays,
and those arrays can be solved. Second, by dividing the array into
halves, then dividing those halves by recursively halving them
into arrays of single elements, two sorted arrays are merged into
one array, as a single element array is already sorted. Refer to the
following pseudocode:
Figure 1: Shows the splitting of the input array into single
element arrays.

Input – A: array of n elements
Output – array A sorted in ascending order
1.  proc mergesort(A: array)
2.      var array left, right, result
3.      if length(A) <= 1
4.          return A
5.      var middle = length(A)/2
6.      for each x in A up to middle
7.          add x to left
8.      for each x in A after middle
9.          add x to right
10.     left = mergesort(left)
11.     right = mergesort(right)
12.     result = merge(left, right)
13.     return result

Input – left: array of m elements, right: array of k elements
Output – array result sorted in ascending order
14. proc merge(left: array, right: array)
15.     var array result
16.     while length(left) > 0 and length(right) > 0
17.         if first(left) <= first(right)
18.             append first(left) to result
19.             left = rest(left)
20.         else
21.             append first(right) to result
22.             right = rest(right)
23.     end while
24.     if length(left) > 0
25.         append left to result
26.     if length(right) > 0
27.         append right to result
28.     return result
As the pseudocode shows, after the array is broken up into a left
half and a right half (lines 5–9), the two halves are divided
recursively (lines 10–11) until each is a single element array.
Then, the two halves' elements are compared to determine how the
two arrays should be merged (lines 16–22). Should either half
contain elements not yet added to the sorted array after the
comparisons are made, the remainder is appended so no elements
are lost (lines 24–27). In the following examples, using the given
input, the division of the array (Figure 1) and how the array is
merged back into a sorted array (Figure 2) are illustrated.
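The pseudocode above translates directly into Java, the language used for the evaluation in Section 4; the following is a minimal sketch (class and method names are ours, not taken from the evaluation code), with comments keyed to the pseudocode line numbers:

```java
import java.util.Arrays;

public class MergeSortSketch {
    // Mirrors the pseudocode: split A into halves (lines 5-9),
    // sort each half recursively (lines 10-11), then merge (line 12).
    public static int[] mergesort(int[] a) {
        if (a.length <= 1) return a;                            // lines 3-4
        int middle = a.length / 2;                              // line 5
        int[] left = mergesort(Arrays.copyOfRange(a, 0, middle));
        int[] right = mergesort(Arrays.copyOfRange(a, middle, a.length));
        return merge(left, right);                              // line 12
    }

    // Mirrors lines 14-28: repeatedly take the smaller head element,
    // then append whatever remains of the non-empty half.
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {           // line 16
            if (left[i] <= right[j]) result[k++] = left[i++];   // lines 17-19
            else result[k++] = right[j++];                      // lines 20-22
        }
        while (i < left.length) result[k++] = left[i++];        // lines 24-25
        while (j < right.length) result[k++] = right[j++];      // lines 26-27
        return result;
    }
}
```

Running `mergesort` on the example input (38 27 43 3 9 82 10 1) produces the array in ascending order.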
Figure 2: Shows the merging of the single element arrays
during the Merge Step.
As the example shows, array A is broken in half continuously
until they are in arrays of only a single element, then those single
elements are merged together until they form a single sorted array
in ascending order.
4. EVALUATION
4.1 Theoretical Analysis
4.1.1 Evaluation Criteria
All comparison based sorting algorithms count the comparisons of
array elements as one of their key operations. The Merge Sort
algorithm can be evaluated by measuring the number of
comparisons between array elements. As the key operation, we
can measure the number of comparisons made to determine the
overall efficiency of the algorithm. We intend to show that
because the Merge Sort algorithm makes fewer comparisons than
the currently acknowledged most efficient algorithm, Insertion
Sort[3], Merge Sort is the most efficient comparison sort
algorithm.
4.1.1.1 Merge Sort Case Scenarios
4.1.1.1.1 Worst Case
Merge Sort makes the element comparisons we want to measure
during the merge step, where pairs of arrays are recursively
merged into a single array. Merge Sort’s worst case, depicted in
Figure 3, is the scenario where during each recursive call of the
merge step, the two largest elements are located in different
arrays. This forces the maximum number of comparisons to occur.
In this case, the Merge Sort algorithm’s efficiency can be
represented by the number of comparisons made during each
recursive call of the merge step, which is described in the
following recurrence equation where variable n is denoted as the
array size and T(n) refers to the total comparisons in the merge
step:
T(n) = 2T(n/2) + n − 1        (1)
T(1) = 0                      (2)
Equation (1) gives the total number of comparisons that occur in
Merge Sort dependent on the number of array elements. The
2T(n/2) refers to the comparisons made to the two halves of the
array before the arrays are merged. The n-1 refers to the total
comparisons in the merge step. Equation (2) states the base case,
which is no comparisons are needed for a single element array.
With these two equations, we can determine the total number of
comparisons by looking at each recursive call of the merge step.
We will next solve equation (1) by repeatedly expanding it and
performing substitution.
T(n) = 2[2T(n/4) + n/2 − 1] + n − 1          (3)
T(n) = 4T(n/4) + n − 2 + n − 1               (4)
T(n) = 4[2T(n/8) + n/4 − 1] + n − 2 + n − 1  (5)
T(n) = 8T(n/8) + n − 4 + n − 2 + n − 1       (6)
By expanding equation (1) to get equations (3) through (6), we
can discern a pattern. Using the variable k to indicate the depth
of the recursion, we get the following equation:
T(n) = 2^k T(n/2^k) + kn − (2^k − 1)         (7)
We can solve equation (7) by using the base case in equation (2)
and determine the value of k, which refers to the depth of
recursion.
2^k = n                                      (8)
k = log₂ n                                   (9)
T(n) = n·0 + n log₂ n − n + 1                (10)
By making the statement in equation (8), the recursive term of
equation (7) reduces to the base case in equation (2). We can then
solve equation (8) for k to get equation (9). Equation (9) can then
be used to reduce equation (7) to equation (10), which represents
the total number of comparisons of array elements in the merge
step. This results in a Big O time complexity of O(n log n) overall.
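The closed form in equation (10) can be sanity-checked against the recurrence in equations (1) and (2); the following small sketch (ours, assuming n is a power of two, as the derivation does) computes both:

```java
public class WorstCaseCheck {
    // Recurrence (1)-(2): T(n) = 2*T(n/2) + n - 1, with T(1) = 0.
    static long t(long n) {
        if (n == 1) return 0;
        return 2 * t(n / 2) + n - 1;
    }

    // Closed form (10): T(n) = n*log2(n) - n + 1.
    static long closedForm(long n) {
        long log2 = 63 - Long.numberOfLeadingZeros(n); // exact for powers of two
        return n * log2 - n + 1;
    }
}
```

For n = 16, for example, both the recurrence and the closed form give 49 comparisons.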
4.1.1.1.2 Best Case
The best case of Merge Sort, depicted in Figure 3, occurs when
the largest element of one array is smaller than any element in the
other. In this scenario, only n/2 comparisons of array elements
are made during each merge. Using the same process that we used
to determine the total number of comparisons for the worst case,
we get the following equations:
T(n) = 2T(n/2) + n/2                         (11)
T(n) = 2^k T(n/2^k) + kn/2                   (12)
T(n) = n·0 + (n/2) log₂ n                    (13)
Similarly to before, equation (11) can be expanded to find a
pattern; equation (12) can then be created by substituting k, and by
solving for k we get equation (13), which is the total number of
comparisons for the best case of Merge Sort. This also results in a
Big O time complexity of O(n log n), just like the worst case.
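The best-case closed form in equation (13) can be checked against recurrence (11) in the same way; again a small sketch of ours, assuming n is a power of two:

```java
public class BestCaseCheck {
    // Recurrence (11): T(n) = 2*T(n/2) + n/2, with T(1) = 0.
    static long t(long n) {
        if (n == 1) return 0;
        return 2 * t(n / 2) + n / 2;
    }

    // Closed form (13): T(n) = (n/2)*log2(n).
    static long closedForm(long n) {
        long log2 = 63 - Long.numberOfLeadingZeros(n); // exact for powers of two
        return (n / 2) * log2;
    }
}
```

For n = 16, both give 32 comparisons, roughly two-thirds of the worst-case count for the same n.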
4.1.2 Insertion Sort[3] Case Scenarios
4.1.2.1 Worst Case
Insertion Sort[3] builds the sorted array one element at a time,
removing one element from the input data and placing it in its
correct location each iteration until the array is in sorted order.
The worst case for Insertion Sort[3] is if the input array is in
reverse from sorted order. In this case every iteration of the inner
loop will scan and shift the entire sorted region of the array before
inserting the next element. Denoting n as the number of array
elements, the number of comparisons can be described by the
following equation:
T(n) = (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2   (14)

This results in a Big O time complexity of O(n²) in the worst
case.
4.1.2.2 Best Case
The best case for Insertion Sort[3] is when the input array is
already sorted. In this scenario, one element is removed from the
input array and placed into the sorted array without the need of
shifting elements. This results in a Big O time complexity of O(n)
in the best case.
4.1.3 Analysis
Comparing the time complexities of Merge Sort and Insertion
Sort[3], Insertion Sort[3] beats Merge Sort in the best case, as
Merge Sort has a Big O time complexity of O(n log n) while
Insertion Sort[3] is O(n). However, in the worst case, Merge Sort
is faster than Insertion Sort[3], retaining a time complexity of
O(n log n) against Insertion Sort[3]'s O(n²). This supports the
theory that Merge Sort will overall be a better algorithm than
Insertion Sort[3] on less than optimal input.
Figure 4: Depicts the slower speed of Insertion Sort vs. Merge
Sort with array sizes greater than 1000.

Figure 3: Depicts how single element arrays are merged
together during the Merge step in the best and worst case.
4.2 Empirical Analysis
4.2.1 Evaluation Criteria
The goal of this evaluation is to demonstrate the improved
efficiency of the Merge Sort algorithm based on the execution’s
CPU runtime.
4.2.2 Evaluation Procedures
The efficiency of Merge Sort can be measured by determining the
CPU runtime of an implementation of the algorithm to sort a
number of elements versus the CPU runtime it takes an
implementation of the Insertion Sort[3] algorithm. The dataset
used is a server log of connecting IP Addresses over the course of
one day. All the IP Addresses are first extracted from the log, in a
separate process, and placed within a file. Subsets of the IP
Addresses are then sorted using both the Merge Sort algorithm
and the Insertion Sort[3] algorithm with the following array sizes:
5, 10, 15, 20, 30, 50, 100, 1000, 5000, 10000, 15000, and 20000.
Each array size was run ten times and the average of the CPU
runtimes was taken. Both algorithms take as their parameter an
array of IP Addresses. For reproducibility, the dataset used for the
evaluation can be found at
http://cs.fit.edu/~pkc/pub/classes/writing/httpdJan24.log.zip.
The original dataset has over thirty thousand records. For the
purposes of this experiment, only twenty thousand were used. The
tests were run on a PC running Microsoft Windows XP with the
following specifications: Intel Core 2 Duo CPU E8400 at 3.00
GHz with 3 GB of RAM. The algorithms were implemented in
Java using the Java 6 API.
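A timing harness along the following lines could reproduce the measurement procedure; the array sizes and ten-run averaging follow the text, while the class name, the random data (standing in for the extracted IP Addresses), and the use of the built-in `Arrays.sort` as a placeholder sort are our assumptions:

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.UnaryOperator;

public class TimingHarness {
    // Average wall-clock time of one sort over `runs` repetitions,
    // mirroring the ten-run averaging described in the text. Each run
    // sorts a fresh copy so earlier runs do not pre-sort the input.
    static double averageMillis(int[] data, int runs, UnaryOperator<int[]> sort) {
        long total = 0;
        for (int r = 0; r < runs; r++) {
            int[] copy = Arrays.copyOf(data, data.length);
            long start = System.nanoTime();
            sort.apply(copy);
            total += System.nanoTime() - start;
        }
        return total / (runs * 1e6); // nanoseconds -> milliseconds
    }

    public static void main(String[] args) {
        int[] sizes = {5, 10, 15, 20, 30, 50, 100, 1000, 5000, 10000, 15000, 20000};
        Random rng = new Random(42); // synthetic stand-in for the IP-address data
        for (int n : sizes) {
            int[] data = rng.ints(n).toArray();
            double ms = averageMillis(data, 10, a -> { Arrays.sort(a); return a; });
            System.out.printf("n=%d avg=%.3f ms%n", n, ms);
        }
    }
}
```

Swapping the lambda for a Merge Sort or Insertion Sort implementation would produce the two runtime curves compared in Figures 4 and 5.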
4.2.3 Results and Analysis
Compared to the Insertion Sort[3] algorithm, the Merge Sort
algorithm shows faster CPU runtimes for array sizes over 1000. A
summary of the results is contained in Figure 4 and Figure 5.

Figure 5: Depicts the faster speed of Insertion Sort until
around array sizes of 1000.
With Merge Sort shown in red and Insertion Sort in blue, Figure 5
shows the relative execution speed curves of the two algorithms.
It can be seen that for array sizes under 1000, the Insertion
Sort[3] algorithm has much faster execution times, showing a
clear advantage. This is probably due to the overhead incurred
during the creation and deletion of new arrays during splitting.
Around array sizes of 1000, the algorithms' curves intersect, and
Merge Sort begins to show an advantage over Insertion Sort[3],
which can be seen in Figure 4. As the size of the array grows
larger, Insertion Sort[3]'s execution runtime increases at a faster
pace than Merge Sort's. This shows that Insertion Sort[3] will get
progressively worse than Merge Sort as the array size increases,
while Merge Sort's efficiency will decrease at a slower rate. It can
be concluded from the results that for array sizes over 1000,
Insertion Sort[3] is unsuitable compared to the efficiency of the
Merge Sort algorithm.
5. CONCLUSION
The purpose of this paper was to introduce the Merge Sort
algorithm and show its improvements over its predecessors. Our
theoretical analysis shows that compared to the Bubble Sort[1],
Insertion Sort[3], and Selection Sort[2] algorithms, Merge Sort
has a faster Big O worst-case time complexity of O(n log n). Our
empirical analysis shows that compared to Insertion Sort[3],
Merge Sort is on average 32 times faster for arrays larger than
1000 elements. This makes Merge Sort far more efficient than
Insertion Sort[3] for array sizes larger than 1000.
Although Merge Sort has been shown to be better in the worst
case for array sizes larger than 1000, it was still slower than
Insertion Sort with array sizes less than 1000. This can be
explained by the overhead required for the creation and merging
of all the arrays during the merge step. A future improvement to
the Merge Sort algorithm would be to determine a way to reduce
this overhead.
6. REFERENCES
[1] Knuth, D. "Sorting by Exchanging." The Art of Computer
Programming, Volume 3: Sorting and Searching, Second
Edition. 1997, pp. 106-110.
[2] Knuth, D. "Sorting by Selection." The Art of Computer
Programming, Volume 3: Sorting and Searching, Second
Edition. 1997, pp. 138-141.
[3] Knuth, D. "Sorting by Insertion." The Art of Computer
Programming, Volume 3: Sorting and Searching, Second
Edition. 1998, pp. 80-105.