heap - Dave Reed

CSC 427: Data Structures and Algorithm Analysis
Fall 2004
Heaps and heap sort
 complete tree, heap property
 min-heaps & max-heaps
 heap operations: insert, remove min/max
 heap implementation
 heap sort
Tree balancing
as we saw last time, specialized binary tree structures & algorithms can
ensure O(log N) tree height, and therefore O(log N) cost operations
 e.g., an AVL tree ensures height < 2 log(N) + 2
of course, the IDEAL would be to maintain minimal height
a complete tree is a tree in which
 all leaves are on the same level or else on 2 adjacent levels
 all leaves at the lowest level are as far left as possible
 a complete tree will have minimal depth
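To make "minimal depth" concrete: a complete binary tree fills each level before starting the next, so with N nodes its height (counting edges from the root to the deepest leaf) is floor(log2 N). The sketch below computes that value by repeated halving. The function name completeTreeHeight is illustrative only and does not come from the lecture.

```cpp
// Height (number of edges on the longest root-to-leaf path) of a
// complete binary tree with n >= 1 nodes.
// Hypothetical helper for illustration; not part of the course code.
int completeTreeHeight(int n)
{
    int height = 0;
    while (n > 1) {     // each halving accounts for one more level
        n /= 2;
        height++;
    }
    return height;      // equals floor(log2(n))
}
```

For example, completeTreeHeight(7) is 2 and completeTreeHeight(1000) is 9, well under the AVL bound of 2 log(N) + 2 quoted above.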
Heaps
a heap is a complete binary tree in which
 for every node, the value stored is ≥ every value stored in either of its subtrees
technically, this is the definition of a max-heap, where root is max value in heap
can also define min-heap, where root is min value in heap
since complete, a heap has minimal height
 can insert in O(height) = O(log N)
 searching is O(N), so heaps are not good for general storage
 however, heaps are perfect for implementing priority queues
can access max value in O(1), remove max value in O(height) = O(log N)
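As a point of comparison, the C++ standard library provides std::priority_queue (in <queue>), which behaves as a max-priority queue by default and can be switched to min-first ordering with std::greater. The sketch below only illustrates the priority queue operations listed above. The helper names drainMaxFirst and drainMinFirst are made up for this example.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Returns the values in descending order by repeatedly removing the max.
// Illustrative helper (not from the lecture).
std::vector<int> drainMaxFirst(const std::vector<int> & values)
{
    std::priority_queue<int> pq;            // max-first ordering by default
    for (int v : values) {
        pq.push(v);                         // O(log N) per insertion
    }
    std::vector<int> result;
    while (!pq.empty()) {
        result.push_back(pq.top());         // O(1) access to the largest value
        pq.pop();                           // O(log N) removal
    }
    return result;
}

// Same idea with min-first ordering: std::greater reverses the comparison.
std::vector<int> drainMinFirst(const std::vector<int> & values)
{
    std::priority_queue<int, std::vector<int>, std::greater<int> > pq;
    for (int v : values) {
        pq.push(v);
    }
    std::vector<int> result;
    while (!pq.empty()) {
        result.push_back(pq.top());         // smallest remaining value
        pq.pop();
    }
    return result;
}
```

Given the values 5, 1, 9, 3, drainMaxFirst yields 9, 5, 3, 1 and drainMinFirst yields 1, 3, 5, 9.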
Inserting into a heap
to insert into a heap
 place new item in next open leaf position
 if new value is bigger than parent, then swap nodes
 continue up toward the root, swapping with parent, until a bigger (or equal) parent is found or the root is reached
see http://www.cs.oberlin.edu/classes/dragn/labs/heaps/heaps5.html
note: insertion maintains completeness and the heap property
 worst case, if add largest value, will have to swap all the way up to the root
 but only nodes on the path are swapped, so insertion takes O(height) = O(log N) swaps
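The worst-case claim can be checked with a small experiment. The sketch below inserts into a max-heap of ints and counts the swaps performed. It assumes the level-by-level vector layout described later in these notes (parent of index i at (i-1)/2). The name insertCountingSwaps is hypothetical.

```cpp
#include <utility>
#include <vector>

// Inserts value into a max-heap stored level-by-level in a vector and
// returns the number of swaps performed on the way up.
// Illustrative helper only; not the course's Heap class.
int insertCountingSwaps(std::vector<int> & heap, int value)
{
    heap.push_back(value);                      // next open leaf position
    int pos = heap.size() - 1;
    int swaps = 0;
    while (pos > 0) {
        int parent = (pos - 1) / 2;
        if (heap[pos] <= heap[parent]) {
            break;                              // bigger (or equal) parent found
        }
        std::swap(heap[pos], heap[parent]);
        pos = parent;
        swaps++;
    }
    return swaps;
}
```

Inserting 1, 2, ..., 15 in ascending order makes every new value the largest so far, so each one climbs to the root. The 15th insertion lands at depth 3 and therefore takes exactly 3 swaps.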
Removing root of a heap
to remove the max value (root) of a heap
 replace root with last node on bottom level (note if left or right subtree)
 if new root value is less than either child, swap with larger child
 continue down toward the leaves, swapping with the larger child, until the value is at least as large as both children (or reaches a leaf)
see http://www.cs.oberlin.edu/classes/dragn/labs/heaps/heaps5.html
note: removing root maintains completeness and the heap property
 worst case, if last value is smallest, will have to swap all the way down to leaf
 but only nodes on the path are swapped, so removal takes O(height) = O(log N) swaps
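For comparison, the standard library offers the same remove-root step through std::pop_heap (in <algorithm>), which moves the largest element to the back of the range and restores the heap property on the rest. The sketch below uses that library routine rather than the hand-written method developed in these notes. The wrapper name removeMax is made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Removes and returns the largest value from a non-empty vector that is
// already arranged as a max-heap (e.g., by std::make_heap).
// Illustrative wrapper; not part of the lecture's Heap class.
int removeMax(std::vector<int> & heap)
{
    std::pop_heap(heap.begin(), heap.end());   // max moves to the back; the rest is a heap again
    int maxValue = heap.back();
    heap.pop_back();
    return maxValue;
}
```

After calling std::make_heap on the values 3, 9, 4, 7, 1, successive calls to removeMax return 9 and then 7, and the vector still satisfies std::is_heap after each call.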
Implementing a heap
a heap provides for O(log N) insertion and remove max
 but so do AVL trees and other balanced binary search tree variants
 heaps also have a simple, vector-based implementation
 since there are no holes in a heap, can store nodes in a vector, level-by-level
 root is at index 0
 last leaf is at index v.size()-1
 for a node at index i, children are at 2*i+1 and 2*i+2
 to add at next available leaf, simply push_back
[figure: a complete binary tree holding the values 0 through 9, with its nodes labeled v[0] through v[9] level by level, shown alongside the vector that stores them]
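The index arithmetic above can be sketched as a few one-line helpers, together with a check that a vector satisfies the max-heap property. All of the names here (parentIndex, leftChildIndex, rightChildIndex, isMaxHeap) are illustrative and not part of the course code.

```cpp
#include <vector>

// Index arithmetic for a complete tree stored level-by-level, root at index 0.
// parentIndex is only meaningful for i > 0 (the root has no parent).
int parentIndex(int i)     { return (i - 1) / 2; }
int leftChildIndex(int i)  { return 2 * i + 1; }
int rightChildIndex(int i) { return 2 * i + 2; }

// True if no node holds a value larger than its parent's value,
// i.e., the vector represents a max-heap.
bool isMaxHeap(const std::vector<int> & v)
{
    int n = v.size();
    for (int i = 1; i < n; i++) {
        if (v[i] > v[parentIndex(i)]) {
            return false;
        }
    }
    return true;
}
```

For the node at index 3, the children are at indices 7 and 8, and both of those map back to parent index 3. Completeness comes for free in this layout, since a vector has no holes, so only the ordering needs to be checked.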
Heap class
we can define a templated Heap class to encapsulate heap operations
could then be used whenever a priority queue is needed
#include <vector>
using namespace std;

template <class Comparable>
class Heap
{
  public:
    Heap() { }
    void push(const Comparable & newItem) { /* LATER SLIDE */ }
    void pop() { /* LATER SLIDE */ }
    Comparable top()            // max value is at the root (precondition: heap is not empty)
    {
        return items[0];
    }
    int size()
    {
        return items.size();
    }
  private:
    vector<Comparable> items;   // nodes stored level-by-level, root at index 0
    void swapItems(int index1, int index2)
    {
        Comparable temp = items[index1];
        items[index1] = items[index2];
        items[index2] = temp;
    }
};
push method
push works by
 adding the new item at the next available leaf (i.e., pushing onto the items vector)
 following the path back toward the root, swapping if out of order
recall: position of parent node in vector is (currentPos-1)/2
void push(const Comparable & newItem)
{
    items.push_back(newItem);
    int currentPos = items.size()-1;
    while (currentPos > 0) {                    // stop once the new item reaches the root
        int parentPos = (currentPos-1)/2;
        if (items[currentPos] > items[parentPos]) {
            swapItems(currentPos, parentPos);
            currentPos = parentPos;
        }
        else {
            break;                              // parent is at least as large: heap property holds
        }
    }
}
pop method
pop works by
 replacing the root with the value at the last leaf (and popping from the back of items)
 following the path down from the root, swapping with the larger child if out of order
recall: positions of child nodes in vector are 2*currentPos+1 and 2*currentPos+2
void pop()      // precondition: heap is not empty
{
    items[0] = items[items.size()-1];
    items.pop_back();
    int numItems = items.size();
    int currentPos = 0, childPos = 1;
    while (childPos < numItems) {
        // if there is a right child and it is larger, compare against it instead
        if (childPos+1 < numItems && items[childPos] < items[childPos+1]) {
            childPos++;
        }
        if (items[currentPos] < items[childPos]) {
            swapItems(currentPos, childPos);
            currentPos = childPos;
            childPos = 2*currentPos + 1;
        }
        else {
            break;                              // at least as large as both children
        }
    }
}
Heap sort
the priority queue nature of heaps suggests an efficient sorting algorithm
 start with the vector to be sorted
 construct a heap out of the vector elements
 repeatedly, remove max element and put back into the vector
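template <class Comparable>
void HeapSort(vector<Comparable> & items)
{
    Heap<Comparable> itemHeap;                  // element type must match the vector's
    int numItems = items.size();
    for (int i = 0; i < numItems; i++) {
        itemHeap.push(items[i]);
    }
    // max comes out first, so fill the vector from the back to get ascending order
    for (int i = numItems-1; i >= 0; i--) {
        items[i] = itemHeap.top();
        itemHeap.pop();
    }
}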
 N items in vector, each insertion can require O(log N) swaps to reheapify, so constructing the heap is O(N log N)
 N items in heap, each removal can require O(log N) swaps to reheapify, so copying back is O(N log N)
thus, overall efficiency is O(N log N), which is optimal for a comparison-based sort
 can also implement so that the sorting is done in place, requiring no extra storage
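One common way to do the in-place version is sketched below. It is an assumption about what the note has in mind, not necessarily the version intended for this course. The vector itself is treated as the heap: a first pass arranges it into a max-heap, then the max is repeatedly swapped to the end of a shrinking heap region. The names siftDown and heapSortInPlace are illustrative.

```cpp
#include <utility>
#include <vector>

// Restores the max-heap property for the subtree rooted at pos,
// looking only at the first heapSize elements of items.
template <class Comparable>
void siftDown(std::vector<Comparable> & items, int pos, int heapSize)
{
    int child = 2 * pos + 1;
    while (child < heapSize) {
        if (child + 1 < heapSize && items[child] < items[child + 1]) {
            child++;                            // use the larger of the two children
        }
        if (!(items[pos] < items[child])) {
            break;                              // already at least as large as both children
        }
        std::swap(items[pos], items[child]);
        pos = child;
        child = 2 * pos + 1;
    }
}

// Sorts items into ascending order using no storage beyond the vector itself.
template <class Comparable>
void heapSortInPlace(std::vector<Comparable> & items)
{
    int n = items.size();
    // Phase 1: build a max-heap by sifting down every non-leaf, last parent first.
    for (int i = n / 2 - 1; i >= 0; i--) {
        siftDown(items, i, n);
    }
    // Phase 2: move the current max to the end, shrink the heap, and reheapify.
    for (int end = n - 1; end > 0; end--) {
        std::swap(items[0], items[end]);
        siftDown(items, 0, end);
    }
}
```

Each sift-down costs O(log N) and phase 2 performs N-1 of them, so the overall cost is still O(N log N), matching the analysis above.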
Tuesday: TEST 2
SIMILAR TO TEST 1, will contain a mixture of question types
 quick-and-dirty, factual knowledge
e.g., TRUE/FALSE, multiple choice
 conceptual understanding
e.g., short answer, explain code
 practical knowledge & programming skills
trace/analyze/modify/augment code
cumulative, but will emphasize material since the last test
study advice:
 review lecture notes (if not mentioned in notes, will not be on test)
 read text to augment conceptual understanding, see more examples
 review quizzes and homeworks
 review TEST 1 for question formats
 feel free to review other sources (lots of C++/algorithms tutorials online)