Chapter 14 Slides

Chapter 14
Advanced Trees
© 2006 Pearson Education Inc., Upper Saddle River, NJ. All rights reserved.
Overview
●
14.1
–
Heaps provide an efficient implementation of a new
kind of queue.
●
14.2
–
Trees to model a cluster of disjoint sets.
●
14.3
–
Digital search trees provide a new way to store sets
of Strings.
●
14.4
–
Red-black trees, a variation on binary search trees,
are guaranteed to remain balanced.
Heaps
●
Heap
–
A binary tree.
●
The value at each node is less than or equal to the
values at the children of that node.
●
The tree is perfect or close to perfect.
–
The requirement that a heap is “perfect or close to
perfect” lets us use a contiguous representation
somewhat analogous to the representation of
multidimensional arrays from Section 12.3.
–
We use an ArrayList with the root at position 0, its
children in the next two positions, their children in
the next four, and so on.
A Sample Heap
●
Find the relatives
–
The left child of the node at index i is at index 2i + 1.
–
The right child is at index 2i + 2 (e.g. nodes 5 & 10)
–
The parent is at index ⌊(i − 1) / 2⌋.
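A minimal sketch of the index arithmetic for a heap stored with the root at position 0. The class and method names are hypothetical and are not taken from the book's code.

```java
// Hypothetical helper; names are illustrative, not from the slides.
class HeapIndex {
    static int leftChild(int i) {
        return 2 * i + 1;
    }

    static int rightChild(int i) {
        return 2 * i + 2;
    }

    // Valid for i >= 1 only: the root (index 0) has no parent.
    // Integer division rounds down for these non-negative values.
    static int parent(int i) {
        return (i - 1) / 2;
    }
}
```

For example, the children of the node at index 3 are at indices 7 and 8, and both of those report 3 as their parent.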
Priority Queues
●
A heap is a good data structure for implementing
a priority queue.
–
When we remove something from a regular queue, we get
the oldest element, because the queue is first-in first-out (FIFO).
–
When we remove something from a priority queue, we get the smallest element.
–
The smallest element in a heap is always at index 0.
Priority Queues (adding a new node, 3)
●
Add something to a priority queue by tacking it onto
the end.
●
Then filter the element up toward the root until it is in
a valid position.
●
Even in the worst case, this takes time proportional to
the height of the tree.
●
Since the tree is perfect or close to perfect, this is
in O(log n) time.
Priority Queue
- Code for adding an element into a priority queue
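The original code figure is not reproduced here. The following is a minimal sketch of adding to a heap, assuming an ArrayList-backed min-heap; the class name MinHeapSketch and its methods are assumptions, not the book's Heap class.

```java
import java.util.ArrayList;

// A minimal min-heap sketch; names are assumptions, not the book's code.
class MinHeapSketch<E extends Comparable<E>> {
    private final ArrayList<E> data = new ArrayList<>();

    /** Tacks target onto the end, then filters it up toward the root. */
    public void add(E target) {
        data.add(target);
        filterUp(data.size() - 1);
    }

    /** Swaps the element at index with its parent until the parent is no larger. */
    private void filterUp(int index) {
        while (index > 0) {
            int parent = (index - 1) / 2;
            if (data.get(parent).compareTo(data.get(index)) <= 0) {
                return; // heap property holds, nothing more to do
            }
            E temp = data.get(parent);
            data.set(parent, data.get(index));
            data.set(index, temp);
            index = parent;
        }
    }

    /** Returns the smallest element; throws if the heap is empty. */
    public E peek() {
        return data.get(0);
    }

    public int size() {
        return data.size();
    }
}
```

Each pass of the loop moves one level closer to the root, so the work is bounded by the height of the tree.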
Removing an Element from a Priority Queue
●
For example, remove the root node from a priority queue
(Fig. 14-6):
–
Remove the current root node.
–
Move the element in the last position to the root and filter it
down until it is in a legitimate position.
●
This takes O(log n) time.
Priority Queue
- Code for removing an element from a priority queue
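The original code figure is not reproduced here. The following is a minimal sketch of removal, assuming the heap is a List of Integer values with the root at index 0; the class HeapRemoval and its static methods are hypothetical.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of removal from a min-heap stored in a List.
class HeapRemoval {
    /** Removes and returns the smallest element of a non-empty heap. */
    static int removeMin(List<Integer> heap) {
        int result = heap.get(0);
        int last = heap.remove(heap.size() - 1); // remove by index
        if (!heap.isEmpty()) {
            heap.set(0, last);   // move the last element to the root
            filterDown(heap, 0); // then repair the heap
        }
        return result;
    }

    /** Swaps the element at index with its smaller child until neither child is smaller. */
    static void filterDown(List<Integer> heap, int index) {
        int size = heap.size();
        while (true) {
            int left = 2 * index + 1;
            int right = left + 1;
            int smallest = index;
            if (left < size && heap.get(left) < heap.get(smallest)) {
                smallest = left;
            }
            if (right < size && heap.get(right) < heap.get(smallest)) {
                smallest = right;
            }
            if (smallest == index) {
                return; // element is in a legitimate position
            }
            Collections.swap(heap, index, smallest);
            index = smallest;
        }
    }
}
```

Swapping with the smaller child keeps the heap property for the other child as well.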
Heapsort (Fig. 14-8)
●
A very useful sorting algorithm.
●
Call the Heap constructor to create a heap from the input
elements.
●
Note that the constructor on lines 1-13 in Fig. 14-8 is an
overloaded constructor for the Heap class. The
other was in Figure 14-3.
●
Call the heapsort method to remove the root
(smallest) element from the heap, one at a time.
●
After each removal, call filterDown() to make the heap valid again.
●
Elements are removed from the heap in ascending order
(i.e. the output elements are in ascending order).
●
The heapsort algorithm is O(n log n).
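The steps above can be sketched as follows. This is not the code from Fig. 14-8: it assumes Integer elements, builds the heap bottom-up by filtering down every internal node, and collects the output in a new list. The class name HeapsortSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical heapsort sketch using a min-heap stored in a List.
class HeapsortSketch {
    static List<Integer> heapsort(List<Integer> input) {
        List<Integer> heap = new ArrayList<>(input);
        // Build the heap: filter down each internal node, last one first.
        for (int i = heap.size() / 2 - 1; i >= 0; i--) {
            filterDown(heap, i);
        }
        // Repeatedly remove the root (smallest) and repair the heap.
        List<Integer> sorted = new ArrayList<>();
        while (!heap.isEmpty()) {
            sorted.add(heap.get(0));
            int last = heap.remove(heap.size() - 1);
            if (!heap.isEmpty()) {
                heap.set(0, last);
                filterDown(heap, 0);
            }
        }
        return sorted;
    }

    static void filterDown(List<Integer> heap, int index) {
        int size = heap.size();
        while (true) {
            int left = 2 * index + 1;
            int right = left + 1;
            int smallest = index;
            if (left < size && heap.get(left) < heap.get(smallest)) {
                smallest = left;
            }
            if (right < size && heap.get(right) < heap.get(smallest)) {
                smallest = right;
            }
            if (smallest == index) {
                return;
            }
            Collections.swap(heap, index, smallest);
            index = smallest;
        }
    }
}
```

There are n removals and each filterDown takes O(log n) time, which gives the O(n log n) bound.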
Heaps
●
The java.util package contains a PriorityQueue
class.
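A short usage sketch of java.util.PriorityQueue, which by default orders elements by their natural ordering so that poll() returns the smallest. The wrapper class PriorityQueueDemo and its drain method are hypothetical.

```java
import java.util.PriorityQueue;

// Hypothetical demo: adds values, then removes them smallest first.
class PriorityQueueDemo {
    static int[] drain(int... values) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int v : values) {
            queue.add(v);
        }
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = queue.poll(); // removes the current smallest element
        }
        return out;
    }
}
```

Calling drain(5, 8, 3) yields the values in ascending order: 3, 5, 8.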
Disjoint Set Clusters
●
Disjoint sets
–
Sets are disjoint if they have no elements in
common.
–
Clusters of disjoint sets include the sets of players
on different baseball teams, the sets of cities in
different countries, and the sets of newspapers
owned by different media companies.
–
We want two operations to be efficient:
●
Determining whether two elements belong to the same
set.
●
Merging two sets.
Disjoint Set Clusters
●
Up-trees
–
A cluster is represented as a forest of up-trees, in which
nodes keep track of their parents.
Disjoint Set Clusters
●
To determine whether two elements are in the same
set, follow the parents up from each element. If both
lead to the same root, the elements are in the same tree.
Disjoint Set Clusters
●
Represent the up-trees with an array: at each position we
store the parent of the corresponding element.
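A minimal sketch of this representation, assuming elements are the integers 0 to capacity - 1 and that a root is marked with -1. The class name UpTreeCluster and the -1 marker are assumptions for illustration; the book's code may differ.

```java
// Hypothetical up-tree cluster: parents[i] is the parent of element i.
class UpTreeCluster {
    private final int[] parents;

    /** Initially every element is the root of its own one-node tree. */
    UpTreeCluster(int capacity) {
        parents = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            parents[i] = -1; // assumption: a negative entry marks a root
        }
    }

    /** Follows parents upward until a root is reached. */
    int findRoot(int i) {
        while (parents[i] >= 0) {
            i = parents[i];
        }
        return i;
    }

    boolean inSameSet(int i, int j) {
        return findRoot(i) == findRoot(j);
    }

    /** Merges the sets containing i and j by linking one root under the other. */
    void mergeSets(int i, int j) {
        int a = findRoot(i);
        int b = findRoot(j);
        if (a != b) {
            parents[b] = a;
        }
    }
}
```

The check a != b avoids making a root its own parent, which would turn findRoot() into an infinite loop.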
Disjoint Set Clusters
●
All of the methods take constant time except
for findRoot().
–
In the worst case, findRoot() takes time in Θ(n).
Disjoint Set Clusters
●
We can keep the trees shorter by making the
root of the shorter tree a child of the root of the
taller tree.
Disjoint Set Clusters
●
We need to keep track of the height of each
tree.
–
We need to keep track of heights only for the roots.
–
Store the height of each root in the array parents.
–
To avoid confusion between a root with height 3
and a node whose parent is 3, we store the heights
as negative numbers.
Disjoint Set Clusters
●
A tree might have height 0, and storing -0 would look like
a node whose parent is 0.
–
The entry for the root of a tree of height h is -h - 1.
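A sketch of merging by height with the negative encoding described above, where a root of height h stores -h - 1. The class name HeightCluster and its methods are hypothetical.

```java
// Hypothetical cluster that keeps trees short by merging by height.
class HeightCluster {
    // For a non-root, parents[i] is the parent index (>= 0).
    // For a root of height h, parents[i] is -h - 1 (always negative).
    private final int[] parents;

    HeightCluster(int capacity) {
        parents = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            parents[i] = -1; // root of height 0
        }
    }

    int findRoot(int i) {
        while (parents[i] >= 0) {
            i = parents[i];
        }
        return i;
    }

    /** Height of the tree whose root is at the given index. */
    int height(int root) {
        return -parents[root] - 1;
    }

    void mergeSets(int i, int j) {
        int a = findRoot(i);
        int b = findRoot(j);
        if (a == b) {
            return;
        }
        if (parents[a] < parents[b]) {        // a is taller (more negative)
            parents[b] = a;
        } else if (parents[b] < parents[a]) { // b is taller
            parents[a] = b;
        } else {                              // equal heights: result grows by one
            parents[b] = a;
            parents[a]--;
        }
    }
}
```

The height only increases when two trees of equal height are merged, which is what keeps the trees short.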
Disjoint Set Clusters
●
Path Compression
–
A second improvement
●
Suppose we determine that the root of the tree
containing 4 is 7.
●
We can make this operation faster next time by
making 7 the parent of 4.
●
In fact, we might as well make 7 the parent of every
node we visit on the way to the root.
●
We can't alter these parents until after we've found the
root.
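A sketch of findRoot() with path compression, written as two passes: the first finds the root, the second re-parents every node visited. The class CompressingCluster, its constructor taking a ready-made parents array, and parentOf() are hypothetical conveniences for the example.

```java
// Hypothetical cluster demonstrating path compression in findRoot().
class CompressingCluster {
    private final int[] parents; // negative entry marks a root

    /** Assumption for the demo: start from a given parents array. */
    CompressingCluster(int[] parents) {
        this.parents = parents.clone();
    }

    int findRoot(int i) {
        // First pass: find the root without changing anything.
        int root = i;
        while (parents[root] >= 0) {
            root = parents[root];
        }
        // Second pass: make the root the parent of every node on the path.
        while (i != root) {
            int next = parents[i];
            parents[i] = root;
            i = next;
        }
        return root;
    }

    int parentOf(int i) {
        return parents[i];
    }
}
```

With a chain 4 → 2 → 7, one call to findRoot(4) returns 7 and leaves both 4 and 2 as direct children of 7.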
Disjoint Set Clusters
●
With both improvements, the operations take amortized time in
O(log* n), where log* n is the number of times we
have to take the logarithm to get down to 1.
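To see how slowly log* n grows, here is a small sketch that assumes base-2 logarithms and counts how many times the logarithm must be applied before the value is at most 1. The class IteratedLog is hypothetical. It works with integer towers of twos (1, 2, 4, 16, 65536) to avoid floating-point rounding.

```java
// Hypothetical helper computing the iterated logarithm, base 2.
class IteratedLog {
    /** Smallest k such that a tower of k twos is at least n (for n >= 1). */
    static int logStar(long n) {
        long tower = 1; // value of a tower of k twos; k = 0 gives 1
        int k = 0;
        while (tower < n) {
            if (tower >= 63) {
                // 2^tower would exceed any long, so one more step suffices.
                return k + 1;
            }
            tower = 1L << tower; // next tower: 2, 4, 16, 65536
            k++;
        }
        return k;
    }
}
```

For every n up to 65536 the result is at most 4, and for every other value that fits in a long it is 5.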
Digital Search Trees
●
The entire word set can be represented as a
digital search tree.
–
Words are represented as paths through the tree.
–
Each child of a node is associated with a letter.
Digital Search Trees
●
Whenever the user enters a letter, we
descend to the appropriate child.
–
If there is no such child, the user loses.
–
When we need to pick a letter, we randomly
choose a child and the corresponding letter.
Digital Search Trees
●
getChild() is nothing more than a hash table
lookup.
–
This makes a digital search tree an excellent choice
when searching for Strings containing a prefix
that grows one character at a time.
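A minimal sketch of a digital search tree node whose children are kept in a HashMap keyed by letter, so that getChild() is a single hash table lookup. The class DigitalNode and the methods addWord(), hasPrefix(), and contains() are assumptions for illustration, not the book's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical digital search tree node; each child is associated with a letter.
class DigitalNode {
    private final Map<Character, DigitalNode> children = new HashMap<>();
    private boolean endOfWord; // true if the path to this node spells a stored word

    /** Returns the child for letter c, or null if there is no such child. */
    DigitalNode getChild(char c) {
        return children.get(c);
    }

    /** Adds word as a path starting from this node. */
    void addWord(String word) {
        DigitalNode node = this;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new DigitalNode());
        }
        node.endOfWord = true;
    }

    /** Descends one letter at a time; fails as soon as a child is missing. */
    boolean hasPrefix(String prefix) {
        DigitalNode node = this;
        for (char c : prefix.toCharArray()) {
            node = node.getChild(c);
            if (node == null) {
                return false;
            }
        }
        return true;
    }

    boolean contains(String word) {
        DigitalNode node = this;
        for (char c : word.toCharArray()) {
            node = node.getChild(c);
            if (node == null) {
                return false;
            }
        }
        return node.endOfWord;
    }
}
```

Checking a prefix that grows by one character costs one extra getChild() call, rather than a fresh search through all the words.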
Red-Black Trees
●
A plain binary search tree performs poorly if
the tree is not balanced.
–
The worst case occurs if the elements are inserted in
order: the tree is linear, so search, insertion, and
deletion take linear time.
●
Red-black tree
–
Ensures that the tree cannot become significantly
imbalanced.
–
In the Java collections framework, the classes
TreeSet and TreeMap use red-black trees.
Red-Black Trees
●
A red-black tree is a binary search tree in which each
node has a color, either red or black.
–
The root is black.
–
No red node has a red child.
–
All paths from the root to a null child contain the same
number of black nodes.
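The three properties can be checked recursively. The sketch below assumes a bare node class with a color flag and null for missing children; the names RedBlackCheck, Node, blackHeight(), and isValid() are hypothetical. It checks only the color properties, not the binary search tree ordering.

```java
// Hypothetical checker for the red-black color properties.
class RedBlackCheck {
    static class Node {
        final boolean red;
        final Node left;
        final Node right;

        Node(boolean red, Node left, Node right) {
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red; // a missing child counts as black
    }

    /**
     * Returns the number of black nodes on every path from n down to a
     * null child, or -1 if the subtree violates a property.
     */
    static int blackHeight(Node n) {
        if (n == null) {
            return 0;
        }
        if (n.red && (isRed(n.left) || isRed(n.right))) {
            return -1; // a red node has a red child
        }
        int left = blackHeight(n.left);
        int right = blackHeight(n.right);
        if (left < 0 || right < 0 || left != right) {
            return -1; // paths disagree on the number of black nodes
        }
        return left + (n.red ? 0 : 1);
    }

    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }
}
```

A black root with two red leaves passes, while a black root with a single black child fails because the two sides have different black counts.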
Red-Black Trees
●
These properties ensure that the tree cannot
be significantly imbalanced.
–
The shortest path to a null child contains d nodes.
–
The longest path can contain at most 2d nodes.
–
Height of a red-black tree containing n nodes is in
O(log n).
–
Search in red-black trees is identical to search in
binary search trees.
Red-Black Trees
●
To insert, start at the root and descend until we either
find the target or try to descend from a leaf.
–
In the latter case, the target is not present in the
tree, so we attach a new, red leaf.
–
The new node may be a child of another red node.
–
We fix this by working our way back up the tree,
changing colors and rearranging nodes.
–
The repair operation is complicated, but the time for
insertion is still in O(log n).
Red-Black Trees
●
A key step in tree repair is rotation.
●
When we rotate, we replace a node with one of
its children.
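A minimal sketch of the two rotations on a plain binary tree node with left and right references only. The names RotationSketch, Node, rotateLeft(), and rotateRight() are hypothetical; a real red-black tree would also update parent references and use a sentinel instead of null.

```java
// Hypothetical rotation sketch on a bare binary search tree node.
class RotationSketch {
    static class Node {
        int key;
        Node left;
        Node right;

        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    /** The right child takes node's place; returns the new subtree root. */
    static Node rotateLeft(Node node) {
        Node child = node.right;
        node.right = child.left; // child's inner subtree changes sides
        child.left = node;
        return child;
    }

    /** The left child takes node's place; returns the new subtree root. */
    static Node rotateRight(Node node) {
        Node child = node.left;
        node.left = child.right;
        child.right = node;
        return child;
    }
}
```

Both rotations preserve the binary search tree ordering, and the caller is responsible for attaching the returned node where the old one was.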
Red-Black Trees
●
Work back up the tree in several steps,
performing color changes and rotations.
–
Each step either fixes the tree or moves the
problem closer to the root.
–
If we get to the root and still have a red node, we can
simply color it black.
–
How do we fix this?
●
Three cases to consider, depending on the color of the
node's parent's sibling and on whether node is a left or
right child.
Red-Black Trees
●
Node has a red aunt
–
Since the great-grandparent may also be red, we
may have to do some repair there, too, but we're
getting closer to the root.
Red-Black Trees
●
Node has a black aunt and is an outer child.
–
No further work is necessary at this point.
Red-Black Trees
●
Node has a black aunt and is an inner child
–
After a rotation, the new outer child is red and has a red
parent, so we can repair it as before.
Red-Black Trees
●
Splicing out a red node can never cause any
problems.
–
Let node be the child of the node that was spliced
out.
–
If node is red, we can simply color it black to cancel
out the problem.
–
If node is black, the subtree rooted at node's parent
is short a black node.
Red-Black Trees
●
Node's sibling is black and has two black
children
–
Repair continues closer to the root.
Red-Black Trees
●
Node's sibling is black and the outer child is red
–
No further work is necessary at this point.
Red-Black Trees
●
Node's sibling is black, has a black outer child
and has a red inner child.
–
Leads to the previous case.
Red-Black Trees
●
Fourth case: node's sibling is red.
–
This leads to one of the other three cases.
Red-Black Trees
●
We use references to a special black node
called a sentinel.
–
Sentinel indicates that we can't go any farther.
–
We use a single sentinel instance to represent all
nonexistent children.
–
We keep track of the parent of each node.
–
The parent of the root is the sentinel.
Summary
●
A heap is a binary tree data structure used
either to represent a priority queue or in the
heapsort algorithm.
–
A heap is a perfect binary tree or close to it.
–
The value at each node is less than or equal to the
values at the node's children.
–
When changes are made to a heap, it is repaired by
filtering the offending element up or down until it is
in the right place.
Summary
●
A cluster of disjoint sets may be represented as
a forest of up-trees.
–
Two elements in the same set are in the same tree,
which can be detected by following parents up to
the root.
–
This data structure supports efficient algorithms for
determining whether two elements are in the same set
and for merging sets. With merging by height and path
compression, these operations take amortized time in
O(log* n).
●
A set of strings may be represented as a digital
search tree.
–
Each string corresponds to a path through the tree.
Summary
●
Java's TreeSet and TreeMap classes use red-black
trees, which are similar to binary search trees.
–
The colors in the tree must have certain properties,
which guarantee that the tree cannot be badly out
of balance.
–
Worst-case running time is in O(log n) for search,
insertion, and deletion.
●
A special sentinel node is used in place of
absent parents and children, where there would
normally be null references.
Summary
●
Search in a red-black tree works just as it does in
a binary search tree.
–
After the basic insertion and deletion operations, it
may be necessary to repair the tree to satisfy the
properties.
Chapter 14 Self-Study Homework
●
Pages: 377
●
A. Do Exercise 14.1
●
B. Use heapsort to sort the following integer set
{5, 2, 1, 4, 9, 8, 10, 7, 6, 3} in ascending order.
1. Use the input data to create a heap.
2. Then output the root node one by one to sort the data set in
ascending order.
3. Show the heap in step 1 (the heap with 10 integers).
4. Show the heap after outputting each integer (9 heaps in total).