HASH TABLES IN C

HASH TABLES
-Paritosh Gupta
Problem
• Required: search for "The Precious" (here, the black shirt)
• One way would be to map all the data into key-value pairs.
• This means giving each item a unique identifier and then having the program search for the item when it is given that identifier.
• But that method is slow, as the program will still be searching and comparing the stored entries one by one until it finds the black shirt.
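The slow approach above can be sketched in C as a plain linear search. This is only a minimal sketch: the struct item type and the linear_find function are hypothetical names invented for illustration, not taken from the slides.

```c
#include <stddef.h>
#include <string.h>

/* One key-value pair: an identifier and the item it names (illustrative). */
struct item {
    const char *key;
    const char *value;
};

/* linear_find: compare the wanted key against every entry in turn.
 * Returns the matching value, or NULL if no entry has that key.
 * If comparisons is not NULL, it receives the number of keys compared. */
const char *linear_find(const struct item *items, size_t n,
                        const char *key, size_t *comparisons)
{
    size_t i;

    if (comparisons != NULL)
        *comparisons = 0;
    for (i = 0; i < n; i++) {
        if (comparisons != NULL)
            (*comparisons)++;
        if (strcmp(items[i].key, key) == 0)
            return items[i].value;
    }
    return NULL;
}
```

If the black shirt happens to be the last of n entries, all n keys are compared before it is found, and a missing key also costs n comparisons.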
Hash tables exist to speed up the searching process
• Obtain a key, in this case the shirt
• Create a lot of buckets into which key/value pairs can be distributed.
• Choose a rule for assigning specific keys to specific buckets
The rule that we choose to associate keys with buckets is called a hash function
• Data structures that distribute items using a hash function are called hash tables
• Buckets are the cells of a hash table
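One possible rule for assigning string keys to buckets is sketched below. It is an assumption made for illustration, not the function used in the slides: NHASH (the number of buckets) is set to 32 and MULTIPLIER to 31 arbitrarily, and hash is a hypothetical name.

```c
enum { NHASH = 32, MULTIPLIER = 31 };   /* assumed values, for illustration only */

/* hash: turn a string key into a bucket number in the range 0..NHASH-1. */
unsigned int hash(const char *key)
{
    unsigned int h = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)key; *p != '\0'; p++)
        h = MULTIPLIER * h + *p;    /* mix each character into the running value */
    return h % NHASH;               /* reduce the result to a bucket number */
}
```

The same key always produces the same bucket number, which is what lets a later search go straight to the right bucket.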
Overview of approach
• Analyze data
• Store data
    Designing a hash function
    Hash key to find bucket
• Retrieving data
    Hash key
    Lookup table
Hash Function design
• A hash function is defined in advance, and differs for different key data types
• A hash function always gives the same value for a given key.
• So for example, passing xyzBlack Shirt to the function gives the value 20
• We can then access memory location 20 to get the value
• A good hash function spreads keys over the memory locations evenly, as if at random
• Create the entry if it is not found
• But it is also possible that the hash function gives the same value for another key.
[Figure: a container with slots numbered 18 to 24. Slot 20 holds the value "The amazing black shirt"; the other values shown are "Blue jeans", "Cap" and "hat".]
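As a rough sketch of the idea above: hash the key, go directly to that slot, and notice when a different key already sits there. The names toy_hash, store, fetch and NSLOTS are made up for this example, and the character-sum hash is used only because its collisions are easy to predict; it is not a good hash function.

```c
#include <stddef.h>
#include <string.h>

enum { NSLOTS = 32 };   /* assumed table size, for illustration only */

struct slot {
    const char *key;    /* NULL means the slot is empty */
    const char *value;
};

/* toy_hash: add up the character codes and reduce modulo NSLOTS.
 * Keys made of the same letters in a different order share a slot. */
unsigned int toy_hash(const char *key)
{
    unsigned int sum = 0;

    while (*key != '\0')
        sum += (unsigned char)*key++;
    return sum % NSLOTS;
}

/* store: place the pair in the slot chosen by the hash function.
 * Returns 1 on success, or 0 if a different key already occupies the
 * slot (a collision, which this sketch does not try to resolve). */
int store(struct slot table[], const char *key, const char *value)
{
    struct slot *s = &table[toy_hash(key)];

    if (s->key != NULL && strcmp(s->key, key) != 0)
        return 0;
    s->key = key;
    s->value = value;
    return 1;
}

/* fetch: hash the key, look only at that one slot, and return the
 * stored value, or NULL if the slot is empty or holds another key. */
const char *fetch(const struct slot table[], const char *key)
{
    const struct slot *s = &table[toy_hash(key)];

    if (s->key != NULL && strcmp(s->key, key) == 0)
        return s->value;
    return NULL;
}
```

With this toy hash, "cap" and "pac" contain the same letters, so they land in the same slot and the second store reports a collision.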
What if two keys give the same hash value? (collision)
• What can we do when two different values attempt to occupy the same place in an array?
• Solution #1: Search from there for an empty location
    Can stop searching when we find the value or an empty location
    Search must be end-around
• Solution #2: Use a second hash function
    ...and a third, and a fourth, and a fifth, ...
• Solution #3: Use the array location as the header of a linked list of values that hash to this location
• All these solutions work, provided:
    We use the same technique to add things to the array as we use to search for things in the array
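Solution #1 might look like the sketch below, under several assumptions: a tiny table of 8 locations so the end-around step is easy to see, a simple character-sum hash, and hypothetical names (struct entry, probe, put, get). Both put and get go through the same probe function, so adding and searching use the same technique.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8u   /* assumed size, kept small on purpose */

struct entry {
    const char *key;    /* NULL marks an empty location */
    int value;
};

/* hash: character sum reduced to 0..TABLE_SIZE-1 (illustrative only). */
unsigned int hash(const char *key)
{
    unsigned int sum = 0;

    while (*key != '\0')
        sum += (unsigned char)*key++;
    return sum % TABLE_SIZE;
}

/* probe: start at the hashed location and step forward one location at
 * a time, wrapping from the last location back to 0 (end-around).
 * Stops at the key itself or at the first empty location.
 * Returns that index, or -1 if the table is full and lacks the key. */
static int probe(const struct entry table[], const char *key)
{
    unsigned int start = hash(key);
    unsigned int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        unsigned int pos = (start + i) % TABLE_SIZE;
        if (table[pos].key == NULL || strcmp(table[pos].key, key) == 0)
            return (int)pos;
    }
    return -1;
}

/* put: add or update a pair; returns 1 on success, 0 if the table is full. */
int put(struct entry table[], const char *key, int value)
{
    int pos = probe(table, key);

    if (pos < 0)
        return 0;
    table[pos].key = key;
    table[pos].value = value;
    return 1;
}

/* get: return the entry for key, or NULL if the key is not stored. */
const struct entry *get(const struct entry table[], const char *key)
{
    int pos = probe(table, key);

    if (pos < 0 || table[pos].key == NULL)
        return NULL;
    return &table[pos];
}
```

Deletion is left out of this sketch: simply emptying a location would make later searches stop too early, so a real table needs some marker for removed entries.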
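Sample of Chaining
[Figure: Symtab[NHASH], an array in which each element is either NULL or the head of a linked list. One list holds Name 1 / Value 1 followed by Name 2 / Value 2 and ends in NULL; another holds Name 3 / Value 3 and ends in NULL. The keys "shoes" and "pants" label the figure.]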
Example
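A chained table in the spirit of the Symtab[NHASH] picture could be written as below. This is a sketch under assumptions, not the code from the original slide: the Nameval type, the lookup signature with its create flag, and the values NHASH = 37 and MULTIPLIER = 31 are all chosen here for illustration.

```c
#include <stdlib.h>
#include <string.h>

enum { NHASH = 37, MULTIPLIER = 31 };   /* assumed values */

typedef struct Nameval Nameval;
struct Nameval {
    const char *name;
    int value;
    Nameval *next;      /* next entry in the same chain, or NULL */
};

static Nameval *symtab[NHASH];   /* every chain starts out empty (NULL) */

/* hash: map a string to a chain number in the range 0..NHASH-1. */
static unsigned int hash(const char *str)
{
    unsigned int h = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)str; *p != '\0'; p++)
        h = MULTIPLIER * h + *p;
    return h % NHASH;
}

/* lookup: find name in symtab. If it is absent and create is nonzero,
 * add it with the given value. Returns the entry, or NULL if the name
 * was not found (and not created) or memory ran out. */
Nameval *lookup(const char *name, int create, int value)
{
    unsigned int h = hash(name);
    Nameval *sym;

    for (sym = symtab[h]; sym != NULL; sym = sym->next)
        if (strcmp(name, sym->name) == 0)
            return sym;
    if (!create)
        return NULL;
    sym = malloc(sizeof *sym);
    if (sym == NULL)
        return NULL;
    sym->name = name;        /* not copied: the caller keeps the string alive */
    sym->value = value;
    sym->next = symtab[h];   /* push onto the front of the chain */
    symtab[h] = sym;
    return sym;
}
```

For instance, lookup("shoes", 1, 40) adds the entry on the first call and returns the existing one afterwards, while lookup("shoes", 0, 0) only searches. Two names that hash to the same number simply share a chain.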
Efficiency
• Hash tables are actually surprisingly efficient
• Until the table is about 70% full, the number of probes (places looked at in the table) is typically only 2 or 3
• Sophisticated mathematical analysis is required to prove that the expected cost of inserting into a hash table, or looking something up in the hash table, is O(1 + n/m), where n/m is the load factor (number of stored elements / number of slots)
• Even if the table is nearly full (leading to longer searches), efficiency is usually still quite high
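The load factor n/m is simple to compute, as in the small sketch below. The function names load_factor and is_crowded are hypothetical, and the 0.70 threshold just reuses the "about 70% full" figure quoted above rather than any fixed rule.

```c
#include <stddef.h>

/* load_factor: stored elements divided by slots (n/m).
 * Assumes n_slots is greater than zero. */
double load_factor(size_t n_elements, size_t n_slots)
{
    return (double)n_elements / (double)n_slots;
}

/* is_crowded: nonzero once the table is more than about 70% full,
 * the point after which searches tend to take more probes. */
int is_crowded(size_t n_elements, size_t n_slots)
{
    return load_factor(n_elements, n_slots) > 0.70;
}
```

A table could call is_crowded after each insertion and move to a larger array when it returns nonzero, which keeps n/m, and so the expected cost, small.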
Uses
• Compilers use hash tables to keep track of variables
• Browsers use them to keep track of recent websites or tabs
• Used by processors to manage cache memory
• Applications in search algorithms
• Password search
Source
• http://www.cis.upenn.edu/~adhilton/cse399/hashtable.html
• course text-book
• http://www.stanford.edu/class/archive/cs/cs106b/cs106b.1126/lectures/18/Slides18.pdf