
Data Structures for Java
William H. Ford
William R. Topp
Chapter 21
Hashing as a Map Implementation
Bret Ford
© 2005, Prentice Hall
© 2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved.
Introduction to Hashing



A hash table distributes elements in a
series of linked lists, referred to as
buckets.
A hash function maps a value to an index
in the table. The function provides access
to an element much like an index provides
access to an array element.
Like a binary search tree, a hash table
provides an implementation of the Set and
Map interfaces.
Introduction to Hashing
(continued)


A binary search tree can access data
stored by value with O(log₂ n) average
search time.
We would like to design a storage
structure that yields O(1) average
retrieval time. In this way, access to an
item is independent of the number of
other items in the collection.
Introduction to Hashing
(continued)

A hash table is an array of references.


Associated with the table is a hash function
that takes a key as an argument and returns
an integer value.
By using the remainder after dividing the hash
value by the table size, we have a mapping of
the key to an index in the table.
Introduction to Hashing
(concluded)
[Figure: the hash function gives the hash value, hf(key) = hashValue.
The hash table index is i = hashValue % n, which selects the entry
for the key among table indices 0 to n-1.]
Using a Hash Function

Consider the hash function
hf(x) = x, where x is a nonnegative
integer (the identity function). Assume the
table is the array tableEntry with n = 7
elements.
hf(22) = 22, 22 % 7 = 1, so 22 maps to tableEntry[1]
hf(4) = 4, 4 % 7 = 4, so 4 maps to tableEntry[4]
Using a Hash Function
(concluded)

With hash function hf() and table size n,
the table index for a key is i = hf(key)%n.
Collisions occur for any two keys whose
hash values differ by a multiple of n.
hf(22) = 22, 22 % 7 = 1 and hf(36) = 36, 36 % 7 = 1: both map to tableEntry[1]
hf(5) = 5, 5 % 7 = 5 and hf(33) = 33, 33 % 7 = 5: both map to tableEntry[5]
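The mapping i = hf(key) % n can be sketched in a few lines. This is a minimal illustration that assumes the identity hash function and n = 7 from the example; the class and method names are made up for the sketch.

```java
// Illustrative sketch: identity hash function with a table of n = 7 slots.
class IndexDemo {
    // hf(x) = x for a nonnegative integer x
    static int hf(int x) {
        return x;
    }

    // the table index is the remainder after dividing the hash value by n
    static int tableIndex(int key, int n) {
        return hf(key) % n;
    }

    public static void main(String[] args) {
        int n = 7;
        int[] keys = {22, 36, 5, 33};
        for (int key : keys) {
            System.out.println(key + " -> tableEntry[" + tableIndex(key, n) + "]");
        }
        // 22 and 36 differ by 14 = 2 * 7, so both map to index 1
    }
}
```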
Designing Hash Functions

Some general design principles guide the
creation of all hash functions.


Evaluating a hash function should be efficient.
A hash function should produce uniformly
distributed hash values. This spreads the hash
table indices around the table, which helps
minimize collisions.
Designing Hash Functions
(continued)

The Java programming language provides
a general hashing function with the
hashCode() method in the Object
superclass.
public int hashCode()
{ … }
Designing Hash Functions
(continued)

Object's hashCode() typically converts the internal
address of the object into an integer
value, which has limited application since
two different objects will normally have
different values for hashCode(), even if
they store the same data.
// one and two are distinct objects that store the same data, so
// Object's version of hashCode() would normally give different values;
// String overrides hashCode(), so one.hashCode() == two.hashCode()
String one = new String("java"), two = new String("java");
Designing Hash Functions
(continued)

The Integer class provides the identity
function for hashCode().
public int hashCode()
{ return value; }

Unless the integer data has random
characteristics, this is not a good hash
function.
Designing Hash Functions
(continued)

In the majority of hash-table applications,
the key is a string.

To create an efficient hash function, we must
combine the sequence of characters in the
string to form an integer.
// s holds the characters of the string and n is its length
public int hashCode()
{ int hash = 0;
for (int i = 0; i < n; i++)
hash = 31*hash + s[i];
return hash;
}
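A runnable version of this loop, as a sketch: the method polyHash is a hypothetical name, and it reads characters with charAt() rather than the array s used on the slide. For ordinary strings it should agree with String.hashCode(), which is documented to use the same polynomial in 31.

```java
// Sketch of the polynomial string hash; polyHash is an illustrative name.
class StringHashDemo {
    static int polyHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            // int arithmetic wraps on overflow, so the result may be negative
            hash = 31 * hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(polyHash("and"));                      // 96727
        System.out.println(polyHash("and") == "and".hashCode());  // true
    }
}
```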
Designing Hash Functions
(concluded)
The following are hash code values for three different strings.
The value for string strB is a negative number due to
integer overflow.
String strA = "and", strB = "uncharacteristically",
strC = "algorithm";
hashValue = strA.hashCode(); // hashValue = 96727
hashValue = strB.hashCode(); // hashValue = -2112884372
hashValue = strC.hashCode(); // hashValue = 225490031
In general, a hash function may result in integer overflow and
return a negative number. The following calculation ensures that
the table index is nonnegative.
tableIndex = (hashValue & Integer.MAX_VALUE) % tableSize
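The effect of the mask can be seen in a small sketch. The value -7 stands in for an overflowed hash code and the table size 11 is an arbitrary choice; both are assumptions for illustration.

```java
// Sketch: masking off the sign bit before taking the remainder.
class NonnegativeIndexDemo {
    static int tableIndex(int hashValue, int tableSize) {
        // clear the sign bit, then take the remainder
        return (hashValue & Integer.MAX_VALUE) % tableSize;
    }

    public static void main(String[] args) {
        int hashValue = -7;   // stands in for an overflowed hash code
        int tableSize = 11;
        // in Java the remainder takes the sign of the dividend
        System.out.println(hashValue % tableSize);            // -7, not a valid index
        System.out.println(tableIndex(hashValue, tableSize)); // a value in 0..10
    }
}
```

For a nonnegative hash value the mask changes nothing, so the index is the ordinary remainder.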
User-Defined Hash Functions

To create a custom hash function, a
class overrides the method hashCode().

For the Time24 class, the hash value for an
object is its time converted to minutes. Since
hour and minute are normalized to fall within
the ranges 0 to 23 and 0 to 59 respectively,
each distinct time has a unique hash value.
public int hashCode()
{
// hash value is time in minutes;
// as normalized time, value is nonnegative
return hour*60 + minute;
}
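A custom hashCode() should stay consistent with equals(): objects that compare equal must return the same hash value. The sketch below shows the pairing for a hypothetical, stripped-down stand-in for Time24; the class name, constructor, and normalization code are assumptions, not the book's class.

```java
// Hypothetical minimal stand-in for Time24, showing equals() and hashCode() together.
class Time24Sketch {
    private final int hour;    // 0..23
    private final int minute;  // 0..59

    Time24Sketch(int hour, int minute) {
        // normalize so that hour falls in 0..23 and minute in 0..59
        int total = Math.floorMod(hour * 60 + minute, 24 * 60);
        this.hour = total / 60;
        this.minute = total % 60;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Time24Sketch)) {
            return false;
        }
        Time24Sketch other = (Time24Sketch) obj;
        return hour == other.hour && minute == other.minute;
    }

    @Override
    public int hashCode() {
        // time in minutes: a distinct value in 0..1439 for each distinct time
        return hour * 60 + minute;
    }
}
```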
User-Defined Hash Functions
(continued)

The custom hash function for Product
objects must mix the bits for the serial
number to create a random value.
public class Product
{
// last 4 digits record year in which the product was made.
// identity hash function is not sufficient
private int serialNum;
...
}
User-Defined Hash Functions
(concluded)
public class Product
{
private int serialNum;
...
public int hashCode()
{
// assign serialNum to a long variable
long hashValue = serialNum;
// square to obtain a nonnegative long integer
hashValue *= hashValue;
// return the remainder after dividing
// by the largest int value; its bits
// are "jumbled up"
return (int)(hashValue % Integer.MAX_VALUE);
}
}
Designing Hash Tables

When two or more data items hash to the
same table index, they cannot occupy the
same position in the table.

We are left with the option of locating one of
the items at another position in the table
(linear probing) or of redesigning the table to
store a sequence of colliding keys at each
index (chaining with separate lists) .
Linear Probing

The hash table is an array of elements
with an associated hash function. To
add an item:



Initially, tag each entry in the table as
"empty".
Apply the hash function to the key and take
the remainder after dividing the value by the
table size to obtain a table index. If the entry
is empty, insert the item.
Otherwise, start at the next table index and
scan successive indices, wrapping around to
the start of the table after probing the last
table entry. An insertion occurs at the first
open location.
Linear Probing (continued)

If the search returns to the original hash
location without finding an open slot, the
table is full, and the linear probing algorithm
throws an exception.
[Figure: linear probing example with tableIndex = x % 11]
Linear Probing (continued)
// compute hash index of item for a table of size n
int index = (item.hashCode()&Integer.MAX_VALUE)%n, origIndex;
// save the original hash index
origIndex = index;
// cycle through the table looking for an empty slot, a
// match, or a table-full condition (origIndex == index).
do
{
// test whether the table slot is empty or the key matches
// the data field of the table entry
if table[index] is empty
insert item in table at table[index] and return
else if table[index] matches item
return
// begin a probe starting at the next table location
index = (index+1) % n;
} while (index != origIndex);
// we have gone around table without finding match or open slot
throw new BufferOverflowException();
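The pseudocode above can be turned into a runnable sketch. The class below is an assumption for illustration: it stores Integer keys, uses null to mark an empty slot, returns the index where the item lives, and throws IllegalStateException for a full table instead of the BufferOverflowException shown above.

```java
// Sketch of linear-probing insertion for Integer keys (illustrative names).
class LinearProbeSketch {
    private final Integer[] table;   // null marks an empty slot

    LinearProbeSketch(int n) {
        table = new Integer[n];
    }

    // returns the index where item is stored, whether new or already present
    int add(Integer item) {
        int n = table.length;
        int index = (item.hashCode() & Integer.MAX_VALUE) % n;
        int origIndex = index;
        do {
            if (table[index] == null) {
                table[index] = item;     // first open slot
                return index;
            } else if (table[index].equals(item)) {
                return index;            // item is already in the table
            }
            index = (index + 1) % n;     // probe the next slot, wrapping around
        } while (index != origIndex);
        // this sketch's choice of exception for the table-full condition
        throw new IllegalStateException("hash table is full");
    }
}
```

With a table of size 7, adding 22 and then 36 places 22 at index 1 and pushes 36 to index 2, since both hash to 1.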
Linear Probing (concluded)

If the size of the table is large relative to
the number of items, linear probing works
well, because a good hash function
generates indices that are evenly
distributed over the table range, and
collisions will be minimal. As the ratio of
table size to the number of items
approaches 1, the algorithm deteriorates
to the sequential search.
Chaining with Separate Lists

Chaining with separate lists defines the
hash table as an indexed sequence of
linked lists. Each list, called a bucket,
holds a set of items that hash to the same
table location.
Chaining with Separate Lists
(continued)

A bucket is a singly linked list. Each entry
of the array references the first node in a sequence
of items that hash to the table index. A
node has the familiar structure with two
fields, one for the value and one for the
reference to the next node.
Chaining with Separate Lists
(continued)

To add object item, use the hash function
to identify the index of the appropriate
bucket in the array (table).


If table[i] is null, add item as the first entry in
the list.
Otherwise begin with the first node, entry =
table[i], and compare item with
entry.nodeValue. If there is no match,
continue the scan with node entry.next, and
so forth. If item is not in the list, add it to the
front of the list.
Chaining with Separate Lists
(continued)
Consider the following sequence of eight elements
{54, 77, 94, 89, 14, 45, 35, 76} with the identity hash function and tableSize = 11.
The figure displays the lists. Each entry in a table includes the number of probes to
add the element.
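Since the figure is not reproduced here, the buckets for this sequence can be rebuilt with a short sketch. It assumes the identity hash function and insertion at the front of each list, as described earlier, and it uses java.util.LinkedList for the buckets rather than hand-written nodes.

```java
import java.util.LinkedList;

// Sketch: chaining with separate lists for the eight-element example.
class ChainingDemo {
    // build the buckets for the given nonnegative keys with hf(x) = x
    static LinkedList<Integer>[] build(int[] keys, int tableSize) {
        @SuppressWarnings("unchecked")
        LinkedList<Integer>[] table = new LinkedList[tableSize];
        for (int key : keys) {
            int i = key % tableSize;
            if (table[i] == null) {
                table[i] = new LinkedList<>();
            }
            if (!table[i].contains(key)) {
                table[i].addFirst(key);   // a new item goes to the front of its list
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = {54, 77, 94, 89, 14, 45, 35, 76};
        LinkedList<Integer>[] table = build(keys, 11);
        for (int i = 0; i < table.length; i++) {
            if (table[i] != null) {
                System.out.println(i + ": " + table[i]);
            }
        }
    }
}
```

Under these assumptions, bucket 1 holds 45 followed by 89 and bucket 10 holds 76 followed by 54, because later arrivals are placed at the front.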
Chaining with Separate Lists
(concluded)



Chaining with separate lists is generally
faster than linear probing since chaining
only searches items that hash to the same
table location.
With linear probing, the number of table
entries is limited to the table size, whereas
the linked lists used in chaining grow as
necessary.
To delete an element, just erase it from
the associated list.
Rehashing

As the number of entries in the hash
table increases, search performance
deteriorates. Rehashing increases the hash
table size when the number of entries in the
table reaches a specified percentage of its size.
A Hash Table as a Collection

The generic class Hash stores elements
in a hash table using chaining with
separate lists and implements the
Collection interface.



hashCode() must be provided by the generic
type.
The constructor creates a hash table with
initial size 17. The table grows as rehashing
occurs.
The method toString() returns a comma-separated list that, by the nature of hashing,
is not ordered.
Hash Class Implementation

The hash table is an array in which each
element references the first node of a singly
linked list.

Define an inner class Entry with an integer
field hashValue that stores the hash code
value and avoids recomputing the hash
function during rehashing.
hashValue = item.hashCode() & Integer.MAX_VALUE;
Entry Inner Class
private static class Entry<T>
{
// value in the hash table
T value;
// save value.hashCode() & Integer.MAX_VALUE
int hashValue;
// next entry in the linked list
// of colliding values
Entry<T> next;
// entry with given data and node value
Entry(T value, int hashValue, Entry<T> next)
{
this.value = value;
this.hashValue = hashValue;
this.next = next;
}
}
Hash Class Instance Variables



The Entry array, table, defines the
singly-linked lists that store the elements.
The integer variable hashTableSize
specifies the number of entries in the
table.
The variable tableThreshold has the value
(int)(table.length * MAX_LOAD_FACTOR)
where the double constant
MAX_LOAD_FACTOR specifies the
maximum allowed ratio of the number of
elements in the table to the table size.
Hash Class Instance Variables
(concluded)


MAX_LOAD_FACTOR = 0.75 (number of
hash table entries is 75% of the table
size) is generally a good value. When the
number of elements in the table equals
tableThreshold, a rehash occurs.
The variable modCount is used by
iterators to determine whether external
updates may have invalidated the scan.
Hash Class Constructor

The Hash class constructor creates the
17-element array table with 17 empty
lists. A rehash will first occur when the
hash collection size equals 12.
Hash Class Outline
public class Hash<T> implements Collection<T>
{
// the hash table
private Entry[] table;
private int hashTableSize;
private final double MAX_LOAD_FACTOR = .75;
private int tableThreshold;
// for iterator consistency checks
private int modCount = 0;
// construct an empty hash table with 17 buckets
public Hash()
{
table = new Entry[17];
hashTableSize = 0;
tableThreshold =
(int)(table.length * MAX_LOAD_FACTOR);
}
. . .
}
Hash Class add()

The algorithm for add():


Compute the hash index for the parameter
item and scan the list to see if item is
currently in the hash table. If so, return false.
Create a new Entry with value item and insert
it at the front of the list.

hashValue is assigned to the entry so it will not
have to be computed when rehashing occurs.
Hash Class add() (continued)

Increment hashTableSize and modCount.
If hashTableSize ≥ tableThreshold,
call rehash(). The size of the new table is
2*table.length + 1
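The growth rule and the threshold formula together fix the sequence of table sizes. A small sketch, using the values 17 and 0.75 from the text (the class and method names are illustrative):

```java
// Sketch: table sizes and rehash thresholds produced by the growth rule.
class RehashSchedule {
    static final double MAX_LOAD_FACTOR = 0.75;

    // number of elements at which a table of this length is rehashed
    static int threshold(int tableLength) {
        return (int) (tableLength * MAX_LOAD_FACTOR);
    }

    // size of the table after one rehash
    static int nextSize(int tableLength) {
        return 2 * tableLength + 1;
    }

    public static void main(String[] args) {
        int size = 17;
        for (int step = 0; step < 4; step++) {
            System.out.println("size " + size + ", threshold " + threshold(size));
            size = nextSize(size);
        }
        // sizes 17, 35, 71, 143 with thresholds 12, 26, 53, 107
    }
}
```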
Hash add() (continued)
// add item to the hash table if it is not
// already present and return true; otherwise,
// return false
public boolean add(T item)
{
// compute the hash table index
int hashValue = item.hashCode() &
Integer.MAX_VALUE,
index = hashValue % table.length;
Entry<T> entry;
// entry references the front of a linked
// list of colliding values
entry = table[index];
Hash add() (continued)
// scan the linked list and return false
// if item is in list
while (entry != null)
{
if (entry.value.equals(item))
return false;
entry = entry.next;
}
// we will add item, so increment modCount
modCount++;
// create the new table entry so its successor
// is the current head of the list
entry = new Entry<T>(item, hashValue,
(Entry<T>)table[index]);
Hash add() (concluded)
// add it at the front of the linked list
// and increment the size of the hash table
table[index] = entry;
hashTableSize++;
if (hashTableSize >= tableThreshold)
rehash(2*table.length + 1);
// a new entry is added
return true;
}
Hash Class rehash()

The method rehash() takes the size of
the new hash table as an argument and
performs rehashing.

Create a new table with the specified size and
cycle through the nodes in the original table.
For each node, use the hashValue field
modulo the new table size to hash to the new
index. Insert the node at the front of the
linked list.
Hash Class rehash()
(continued)
private void rehash(int newTableSize)
{
// allocate the new hash table and
// record a reference to the current
// one in oldTable
Entry[] newTable = new Entry[newTableSize],
oldTable = table;
Entry<T> entry, nextEntry;
int index;
// cycle through the current hash table
for (int i=0; i < table.length; i++)
{
// record the current entry
entry = table[i];
Hash Class rehash()
(continued)
// see if there is a linked list present
if (entry != null)
{
// have at least one element in a linked list
do
{
// record the next entry in the
// original linked list
nextEntry = entry.next;
// compute the new table index
index = entry.hashValue % newTableSize;
// insert entry at the front of the
// new table's linked list at
// location index
entry.next = newTable[index];
newTable[index] = entry;
Hash Class rehash()
(concluded)
// assign the next entry in the
// original linked list to entry
entry = nextEntry;
} while (entry != null);
}
}
// the table is now newTable
table = newTable;
// update the table threshold
tableThreshold =
(int)(table.length * MAX_LOAD_FACTOR);
// let garbage collection get rid of oldTable
oldTable = null;
}
Hash remove()

Compute the hash table index. Using
variables prev and curr that move through
the linked list in tandem, search for item.
If not present, return false; otherwise,
remove item from the list. If prev == null,
this involves updating table[index] to
reference the successor to the front of the
list. Decrement hashTableSize, increment
modCount, and return true.
Hash remove() (continued)
public boolean remove(Object item)
{
// compute the hash table index
int index = (item.hashCode() &
Integer.MAX_VALUE) % table.length;
Entry<T> curr, prev;
// curr references the front of a
// linked list of colliding values;
// initialize prev to null
curr = table[index];
prev = null;
// scan the linked list for item
while (curr != null)
if (curr.value.equals(item))
{
// we have located item and will remove
// it; increment modCount
modCount++;
Hash remove() (continued)
// if prev is not null, curr is not the front
// of the list; just skip over curr
if (prev != null)
prev.next = curr.next;
else
// curr is front of the list; the
// new front of the list is curr.next
table[index] = curr.next;
// decrement hash table size and return true
hashTableSize--;
return true;
}
Hash remove() (concluded)
else
{
// move prev and curr forward
prev = curr;
curr = curr.next;
}
return false;
}
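Because the listings for add(), rehash(), and remove() are spread over several slides, a compact version that compiles as a single unit may help. This is a sketch under assumptions, not the book's Hash class: MiniHash, contains(), size(), and buckets() are illustrative names, and the Collection interface, modCount, and iterators are left out.

```java
// Compact sketch of a chained hash table with add, remove, and rehash.
class MiniHash<T> {
    private static class Entry<T> {
        final T value;
        final int hashValue;   // saved so rehash() need not recompute it
        Entry<T> next;

        Entry(T value, int hashValue, Entry<T> next) {
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
    }

    private static final double MAX_LOAD_FACTOR = 0.75;
    private Entry<T>[] table = newTable(17);
    private int size = 0;

    @SuppressWarnings("unchecked")
    private static <E> Entry<E>[] newTable(int n) {
        return (Entry<E>[]) new Entry[n];
    }

    public int size() { return size; }
    public int buckets() { return table.length; }

    public boolean contains(Object item) {
        int index = (item.hashCode() & Integer.MAX_VALUE) % table.length;
        for (Entry<T> e = table[index]; e != null; e = e.next) {
            if (e.value.equals(item)) return true;
        }
        return false;
    }

    public boolean add(T item) {
        if (contains(item)) return false;
        int hashValue = item.hashCode() & Integer.MAX_VALUE;
        int index = hashValue % table.length;
        table[index] = new Entry<>(item, hashValue, table[index]);  // insert at front
        size++;
        if (size >= (int) (table.length * MAX_LOAD_FACTOR)) {
            rehash(2 * table.length + 1);
        }
        return true;
    }

    public boolean remove(Object item) {
        int index = (item.hashCode() & Integer.MAX_VALUE) % table.length;
        Entry<T> prev = null;
        for (Entry<T> curr = table[index]; curr != null; prev = curr, curr = curr.next) {
            if (curr.value.equals(item)) {
                if (prev != null) prev.next = curr.next;   // skip over curr
                else table[index] = curr.next;             // curr was the front
                size--;
                return true;
            }
        }
        return false;
    }

    private void rehash(int newTableSize) {
        Entry<T>[] grown = newTable(newTableSize);
        for (Entry<T> entry : table) {
            while (entry != null) {
                Entry<T> nextEntry = entry.next;
                int index = entry.hashValue % newTableSize;
                entry.next = grown[index];   // relink the node at the front
                grown[index] = entry;
                entry = nextEntry;
            }
        }
        table = grown;
    }
}
```

Adding twelve distinct elements to a new MiniHash reaches the threshold of 12 and triggers one rehash, so the table grows from 17 to 35 buckets.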
Hash Class Iterators

Search the hash table for the first
nonempty bucket in the array of linked
lists. Once the bucket is located, the
iterator traverses all of the elements in the
corresponding linked list and then
continues the process by looking for the
next nonempty bucket. The iterator
reaches the end of the table when it
reaches the end of the list for the last
nonempty bucket.
Hash Class Iterators (continued)

Iterator objects are instances of the
inner class IteratorImpl whose variables
are:




The integer index that identifies the current
bucket (table[index]) scanned by the iterator.
The Entry reference next pointing to the
node whose value next() will return.
The variable lastReturned that references the
last value returned by next().
The iterator variable expectedModCount used
in conjunction with the collection variable
modCount.
Hash Class Iterators
(continued)
// inner class that implements hash table iterators
private class IteratorImpl implements Iterator<T>
{
// next entry to return
Entry<T> next;
// to check iterator consistency
int expectedModCount;
// index of current bucket
int index;
// reference to the last value returned by next()
T lastReturned;
. . .
}
Hash Class Iterators
(continued)

The elements enter the collection in the
order (19, 32, 11, 27) using the identity
hash function. The iterator visits the
elements in the order (11, 32, 27, 19).
Hash Iterator Constructor

A loop iterates up the list of buckets
until it locates the first nonempty bucket.
The loop variable i becomes the initial
value for index and table[i] references the
front of the list. This is the initial value for
next.
Hash Iterator Constructor
(concluded)
IteratorImpl()
{
int i = 0;
Entry<T> n = null;
// the expected modCount starts at modCount
expectedModCount = modCount;
// find the first nonempty bucket
if (hashTableSize != 0)
while (i < table.length &&
((n = table[i]) == null))
i++;
next = n;
index = i;
lastReturned = null;
}
Hash Iterator next()


The method next() first determines
that the operation is valid by checking that
modCount and expectedModCount are
equal and that we are not at the end of
the hash table.
If the iterator is in a consistent state,
next() saves entry.value in lastReturned
and uses a loop index i and entry to
perform the iterator scan for the
subsequent element in the hash table. The
return value is lastReturned.
Hash Iterator next() (continued)
public T next()
{
// check for iterator consistency
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
// we will return the value in Entry object next
Entry<T> entry = next;
// if entry is null, we are at the end of the table
if (entry == null)
throw new NoSuchElementException();
// capture the value we will return
lastReturned = entry.value;
// move to the next entry in the current
// linked list
Entry<T> n = entry.next;
// record the current bucket index
int i = index;
Hash Iterator next()
(concluded)
if (n == null)
{
// we are at the end of a bucket; search for the
// next nonempty bucket
i++;
while (i < table.length &&
(n = table[i]) == null)
i++;
}
index = i;
next = n;
return lastReturned;
}
Hash Iterator remove()


The remove() method first determines
that the operation is valid by checking that
lastReturned is not null and that
modCount and expectedModCount are
equal.
If all is well, the iterator remove() method
calls the Hash class remove() method with
lastReturned as the argument. By
assigning to expectedModCount the
current value of modCount, the iterator
remains consistent.
Hash Iterator remove()
(concluded)
public void remove()
{
// check for a missing call to next()
if (lastReturned == null)
throw new IllegalStateException(
"Iterator call to next() " +
"required before calling remove()");
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
// remove lastReturned by calling remove() in Hash;
// this call will increment modCount
Hash.this.remove(lastReturned);
expectedModCount = modCount;
lastReturned = null;
}
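The same modCount discipline can be observed from the client side. The sketch below uses java.util.HashSet as a stand-in, on the assumption that it behaves like the Hash class here; its documentation describes fail-fast detection as best-effort, so the second method illustrates typical single-threaded behavior rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Sketch: removing through the iterator versus updating behind its back.
class FailFastDemo {
    // removing through the iterator keeps the scan consistent
    static void removeEvens(Set<Integer> set) {
        Iterator<Integer> iter = set.iterator();
        while (iter.hasNext()) {
            if (iter.next() % 2 == 0) {
                iter.remove();
            }
        }
    }

    // returns true if an update made outside the iterator is detected
    static boolean detectsExternalUpdate() {
        Set<Integer> set = new HashSet<>();
        set.add(1);
        set.add(2);
        Iterator<Integer> iter = set.iterator();
        set.add(3);   // external update changes the modification count
        try {
            iter.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```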
The HashMap Collection

The design of the HashMap collection is
similar to the implementation of TreeMap.
A HashMap is not ordered since the
position of elements depends on hashing
the keys. This affects the method
toString() which returns a listing of the
elements based on the iterator order.
The HashMap Collection
(continued)

The HashMap class stores elements in a
hash table containing linked lists of Entry
objects. The inner class Entry contains
key-value pairs.
The HashMap Collection
(continued)

The inner class Entry implements the
Map.Entry interface which defines the
methods getKey(), getValue() and
setValue(). A toString() method returns a
representation of an entry in the format
"key=value". The constructor has
arguments for each field in the node.
Entry Class
(partial listing)
static class Entry<K,V> implements Map.Entry<K,V>
{
K key;
V value;
Entry<K,V> next;
int hashValue;
// make a new entry with given key, value
Entry(K key, V value, int hashValue,
Entry<K,V> next)
{
this.key = key;
this.value = value;
this.hashValue = hashValue;
this.next = next;
}
...
}
Accessing Entries in a HashMap

The methods get() and containsKey()
take a key reference argument and must
locate a corresponding entry in the map.

This task is performed by the private
HashMap method getEntry() which takes a
key as an argument, applies the hash function
to the key and searches the resulting list for a
key-value pair with the same key.
Accessing Entries in a HashMap
(continued)
// return a reference to the entry with the specified key
// if there is one in the hash map; otherwise, return null
public Entry<K,V> getEntry(K key)
{
int index = (key.hashCode() &
Integer.MAX_VALUE) % table.length;
Entry<K,V> entry;
entry = table[index];
while (entry != null)
{
if (entry.key.equals(key))
return entry;
entry = entry.next;
}
return null;
}
Accessing Entries in a HashMap
(concluded)
// returns the value that corresponds to
// the specified key
public V get(K key)
{
Entry<K,V> p = getEntry(key);
if (p == null)
return null;
else
return p.value;
}
Updating Entries in a HashMap

The method put() updates the HashMap.


Construct a table index by applying the hash
function for the key and scan the linked list
for a match with the key. If a match occurs,
apply setValue() and return its result.
If key does not occur in the list, insert a new
Entry object at the front of the linked list. If
the hash map size has reached the table
threshold, apply rehashing. Conclude by
returning null.
Updating Entries in a HashMap
(continued)
// assigns value as the value associated with key
// in this map and returns the previous value
// associated with the key, or null if there
// was no mapping for the key
public V put(K key, V value)
{
// compute the hash table index
int hashValue = key.hashCode() & Integer.MAX_VALUE,
index = hashValue % table.length;
Entry<K,V> entry;
// entry references the front of a linked
// list of colliding values
entry = table[index];
Updating Entries in a HashMap
(continued)
// scan the linked list. if key matches the key in an
// entry, return entry.setValue(value). this
// replaces the value in the entry and returns the
// previous value
while (entry != null)
{
if (entry.key.equals(key))
return entry.setValue(value);
entry = entry.next;
}
// we will add item, so increment modCount
modCount++;
// create the new table entry so its successor
// is the current head of the list
entry = new Entry<K,V>(key, value, hashValue,
(Entry<K,V>)table[index]);
Updating Entries in a HashMap
(concluded)
// add it at the front of the linked list
// and increment the size of the hash map
table[index] = entry;
hashMapSize++;
if (hashMapSize >= tableThreshold)
rehash(2*table.length + 1);
// a new entry is inserted
return null;
}
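The return-value convention of put() is easiest to see from the caller's side. This sketch uses java.util.HashMap, which follows the same convention; the keys and values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: put() returns the previous value, or null for a new key.
class PutDemo {
    public static void main(String[] args) {
        Map<String, Integer> stock = new HashMap<>();
        Integer previous = stock.put("widget", 5);
        System.out.println(previous);              // null: the key is new
        previous = stock.put("widget", 8);
        System.out.println(previous);              // 5: the value that was replaced
        System.out.println(stock.get("widget"));   // 8
        System.out.println(stock.get("gadget"));   // null: no mapping for the key
    }
}
```

A null return is ambiguous when null values are stored in the map, which is one reason containsKey() exists alongside get().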
HashSet Class

The HashSet class uses a HashMap by
composition. The class defines a static
Object reference called PRESENT. This
becomes the value component for each
entry in the map. The constant reference
serves as a dummy placeholder in an
entry pair.

Declare a private instance variable map of
type HashMap having T as the type of the set
elements and Object as the value type. The
constructor instantiates the map collection.
This has the effect of creating an empty set.
HashSet Class (continued)
public class HashSet<T> implements Set<T>
{
// value for each key in the map
private static final Object PRESENT = new Object();
// set implemented using a hash map
private HashMap<T, Object> map;
// create an empty set object
public HashSet()
{ map = new HashMap<T,Object>(); }
. . .
}
HashSet add()

The set methods are implemented with
map methods that use the
entry <item, PRESENT> as the argument.

add() uses the map method put(). If a
duplicate exists, then put() simply updates the
value field of the entry to PRESENT which is
its current value. The map method returns
null if a new element is added, so a return
value of null indicates that the add() inserted
item.
HashSet add() (concluded)
public boolean add(T item)
{
return map.put(item, PRESENT) == null;
}
HashSet iterator()

The HashSet iterator must traverse the
keys in the map. Implement the method
iterator() by returning an iterator for the
key set collection view of the map.
// returns an iterator for the elements in the set
public Iterator<T> iterator()
{
return map.keySet().iterator();
}
HashSet remove()

The HashSet remove() method calls
the remove() method for the map. To
determine whether an element was
removed from the set, verify that the
return value from the map remove() call is
the reference PRESENT.
public boolean remove(Object obj)
{
return map.remove(obj) == PRESENT;
}
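The composition idea can be tried out in a self-contained form. The sketch below wraps java.util.HashMap instead of the HashMap class developed here, and the class name SetByComposition is an assumption; it implements Iterable rather than the full Set interface to stay short.

```java
import java.util.HashMap;
import java.util.Iterator;

// Sketch: a set built by composition on top of a hash map.
class SetByComposition<T> implements Iterable<T> {
    // dummy value stored with every key
    private static final Object PRESENT = new Object();
    private final HashMap<T, Object> map = new HashMap<>();

    // put() returns null only when the key was not already in the map
    public boolean add(T item) {
        return map.put(item, PRESENT) == null;
    }

    // remove() returns PRESENT only when the key was in the map
    public boolean remove(Object obj) {
        return map.remove(obj) == PRESENT;
    }

    public boolean contains(Object obj) {
        return map.containsKey(obj);
    }

    public int size() {
        return map.size();
    }

    // the set's elements are the map's keys
    @Override
    public Iterator<T> iterator() {
        return map.keySet().iterator();
    }
}
```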
Hash Table Performance


A good hash function provides a uniform
distribution of hash values.
Hash table performance is measured by
using the load factor λ = n/m, where n is
the number of elements in the hash table
and m is the number of buckets.


For linear probe, 0 ≤ λ ≤ 1.
For chaining with separate lists, it is possible
that λ > 1.
Hash Table Performance
(continued)

The worst case for linear probing or chaining
with separate lists occurs when all data
items hash to the same table location. If
the table contains n elements, the search
time is O(n), no better than that for the
sequential search.
Hash Table Performance
(continued)

Assume that the hash function uniformly
distributes indices around the hash table.

We can expect λ = n/m elements in each
bucket.
On the average, an unsuccessful search makes λ
comparisons before arriving at the end of a list and
returning failure.
Mathematical analysis shows that the average
number of probes for a successful search is
approximately 1 + λ/2.

Hash Table Performance
(concluded)

Assume the number of elements n in the
hash table is bounded by some amount,
say, R*m, where m is the table size.

In this case, λ = n/m ≤ (R*m)/m = R, and the
following relationships hold for the average
cases, so the average running time is O(1)!
S ≈ 1 + λ/2 ≤ 1 + R/2 (Successful Search)
U = λ ≤ R (Unsuccessful Search)
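As a worked instance of these estimates, the sketch below evaluates them for n = 12 elements in m = 17 buckets, the point at which the Hash class described earlier would rehash. The class and method names are illustrative, and the formulas are the approximations stated above, not measurements.

```java
// Sketch: evaluating the average-case estimates for chaining.
class LoadFactorDemo {
    static double loadFactor(int n, int m) {
        return (double) n / m;
    }

    // U = load factor: expected comparisons for an unsuccessful search
    static double unsuccessful(int n, int m) {
        return loadFactor(n, m);
    }

    // S is approximately 1 + (load factor)/2 for a successful search
    static double successful(int n, int m) {
        return 1 + loadFactor(n, m) / 2;
    }

    public static void main(String[] args) {
        int n = 12, m = 17;
        System.out.printf("load factor = %.3f%n", loadFactor(n, m));  // about 0.706
        System.out.printf("U = %.3f%n", unsuccessful(n, m));          // about 0.706
        System.out.printf("S = %.3f%n", successful(n, m));            // about 1.353
    }
}
```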
Evaluating Ordered and
Unordered Sets and Maps

Use an ordered set or map if an iteration
should return elements in order (average
search time O(log₂ n)). Use an unordered set or
map when fast access and updates are
needed without any concern for the
ordering of elements (average search time
O(1)).
Timing Example

Program SearchComp.java:




Reads a file of 25025 randomly ordered
words and inserts each word into a TreeSet
and into a HashSet.
Determines the amount of time required to
build both of the data structures.
Shuffles the input from the file and times a
search of the TreeSet and HashSet for each
word in the shuffled input.
Displays the time required for each search
technique.
Timing Example (concluded)
Run:
Number of words is 25025
Built TreeSet in 0.078 seconds
Built HashSet in 0.047 seconds
TreeSet search time is 0.078 seconds
HashSet search time is 0.016 seconds
Note that the HashSet search time is
considerably better than that for a TreeSet.
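A comparison of this kind can be approximated without the word file. The sketch below is not SearchComp.java: it generates 25025 synthetic words, uses the java.util TreeSet and HashSet classes, and times only the search phase with System.nanoTime(). Timings depend on the machine and on JVM warm-up, so no expected numbers are given.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Sketch: timing searches in a TreeSet and a HashSet on synthetic words.
class SearchCompSketch {
    // synthetic stand-in for the word file, in shuffled order
    static List<String> makeWords(int count) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            words.add("word" + i);
        }
        Collections.shuffle(words, new Random(42));
        return words;
    }

    // returns the elapsed nanoseconds for searching set for every word
    static long timeSearch(Set<String> set, List<String> words) {
        long start = System.nanoTime();
        int found = 0;
        for (String w : words) {
            if (set.contains(w)) {
                found++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (found != words.size()) {
            throw new IllegalStateException("a word was not found");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        List<String> words = makeWords(25025);
        Set<String> tree = new TreeSet<>(words);
        Set<String> hash = new HashSet<>(words);
        Collections.shuffle(words, new Random(7));
        System.out.println("TreeSet search time is " + timeSearch(tree, words) / 1e9 + " seconds");
        System.out.println("HashSet search time is " + timeSearch(hash, words) / 1e9 + " seconds");
    }
}
```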