Searches and your LAST Data Structure: The Hash Table

Unsorted Array
[0] 66  [1] 5  [2] 90  [3] 55  [4] 1000  [5] 67  [6] 3  [7] 9  [8] 23  [9] 88

What is the most efficient method of searching for a value in an unsorted list?
• Sequential Search – start at one end and work through the list sequentially
What is the Big O value?
• O(n)
On average, how many steps does it take to find a value if there are:
• 100 items in the array? 50
• 1,000 items? 500
• 1,000,000 items? 500,000
• n items? n/2

Sorted Array
[0] 3  [1] 5  [2] 9  [3] 23  [4] 55  [5] 66  [6] 67  [7] 88  [8] 90  [9] 1000

What is the most efficient method of searching a sorted list?
• Binary Search – check the middle index and decide the next search direction based on:
   • Higher
   • Lower
   • Done if found
What is the Big O value?
• O(log n)
How many steps (at most) if there are:
• 100 items? 7
• 1,000 items? 10
• 1,000,000 items? 20
• n items? log2 n

Illustrating a Binary Search in an array
[Figure: a binary search looking for 66 in the sorted array 3, 5, 9, 23, 55, 66, 67, 88, 90, 1000; each probe is marked "?" until 66 is found ("YAY!")]

That was fast. But can we do better?
• Linear search – ok
• Binary search – really good
• But an O(1) search? Is it even possible?

Hash Tables (a final search)
Remember Maps in Java?
• Key, value pair
• If the school wants to store your student info, what is the key? What is the value?
• If the government wants to store all of someone’s financial records (credit score, savings, debt, etc.), what is the key? What is the value?

Space vs. Time Tradeoff
• Uses extra space – roughly 1.5 times the total number of items you want to store
• Let’s say that there are 1800 students.
• We would need an array of about 2700 Student Records.

[Figure: an array of 2700 slots, indices 0 to 2699, all null]

The speed of a HashMap
• If you have a student number, you can instantly (O(1) time) add the StudentRecord to the array!
• If you have the student number, you can instantly (O(1) time) get the StudentRecord from the array!
• Is this really possible?

[Figure: the same 2700-slot array, indices 0 to 2699, all null]

How does it work?
• Create a “bucket” – a key, value pair
• Use the key and some hash function magic
• Each key generates an index value

[Figure: Student ID → hash → index. Student ID 160234 hashes to index 14; student ID 160884 hashes to index 2694]

Hash Function Quality
• What happens with a GOOD hash function? Each key generates a unique index.
• What is the obvious next question?
• What happens with a BAD hash function? What if different items map to the same index value?

[Figure: Student ID → hash → index. Student IDs 160234 and 170138 both hash to index 14]

Oh No!!!! COLLISIONS are BAD!!!

Collisions
• A collision occurs when the hash function maps two or more different items to the same index value.
• 170138 also maps to 14!
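
A collision is easy to reproduce in a few lines. The sketch below is only an illustration: the slides do not say which hash function they use, so a plain "student ID modulo table size" is assumed, which means the index differs from the 14 shown in the figure. The class name CollisionDemo, the method indexFor, and the second student ID are made up for this example.

```java
// Hypothetical sketch: assumes the hash is simply "ID modulo table size".
// The slides' actual hash function is not given.
class CollisionDemo {
    static final int TABLE_SIZE = 2700; // table size taken from the slides

    // Map a (non-negative) student ID to an index in 0..TABLE_SIZE-1.
    static int indexFor(int studentId) {
        return studentId % TABLE_SIZE;
    }

    public static void main(String[] args) {
        int first = 160234;              // student ID from the slides
        int second = first + TABLE_SIZE; // made-up ID chosen so that it collides

        System.out.println(first + " -> " + indexFor(first));   // 160234 -> 934
        System.out.println(second + " -> " + indexFor(second)); // 162934 -> 934
        System.out.println("Collision? " + (indexFor(first) == indexFor(second)));
    }
}
```

Under this assumed hash, any two IDs that differ by a multiple of the table size land in the same slot, so the table needs some rule for where the second record goes.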
Ways to handle collisions:
• Make the array 2-dimensional
• Linear Collision Processing – walk sequentially to the next open location
• Quadratic Collision Processing – jump by a predetermined amount to the next open location

[Figure: the 2700-slot array partly filled with student IDs; 170138 collides with 160234 at index 14 and is moved to another open slot]

Collisions
Most common approach: Chaining – use an array of linked lists instead of an array of Student Records

[Figure: an array of linked lists, indices 0 to 2699; 160234 and 170138 share the list at index 14]

Big O Values
• What is the best case Big O value of adding to or retrieving from a hash table? O(1)
• Worst case? O(n), when every key lands in the same bucket

So how does the hash function work?
• The Object class (the “God Object” that every Java class extends) gives every object a hashCode() method
• It returns an int – we will look at String’s
• “Daniel McKeen” returns: -381446182
• How can I instantly convert that number to a value from 0 to 2699? Modulus (and absolute value)!
• The hash function uses the hashCode method and modulus to generate an int in the proper range.
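
The last two ideas, hashCode() plus modulus for the index and chaining for collisions, can be put together in a short sketch. This is a minimal illustration under stated assumptions, not how java.util.HashMap is actually implemented: ChainedTable, Node, indexFor, put, and get are names invented here, keys and values are plain Strings, and the table never resizes.

```java
// A minimal chained hash table sketch (illustrative names, fixed size).
class ChainedTable {
    // One key/value pair plus a link to the next node in the same bucket.
    static class Node {
        final String key;
        String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets; // each slot is the head of a linked list

    ChainedTable(int size) {
        buckets = new Node[size];
    }

    // hashCode() can be negative. Taking the remainder first and the
    // absolute value second keeps the result in 0..size-1 and avoids
    // the Math.abs(Integer.MIN_VALUE) edge case.
    int indexFor(String key) {
        return Math.abs(key.hashCode() % buckets.length);
    }

    void put(String key, String value) {
        int i = indexFor(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) { // key already stored: overwrite
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]); // add at head of chain
    }

    String get(String key) {
        for (Node n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null; // key not present
    }

    public static void main(String[] args) {
        ChainedTable table = new ChainedTable(2700);
        System.out.println(table.indexFor("Daniel McKeen")); // 982

        // "Aa" and "BB" have the same String hashCode (2112), so they collide.
        table.put("Aa", "first record");
        table.put("BB", "second record");
        System.out.println(table.get("Aa")); // first record
        System.out.println(table.get("BB")); // second record
    }
}
```

With few collisions, put and get touch only one short chain, which is the O(1) best case. If every key hashed to the same index, the chain would hold all n entries and both operations would degrade to O(n).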
[Figure: the 2700-slot array partly filled with student IDs]

Hash Coded Data Storage – Key Points:
1. Must have a hash method to convert key data to an array index – Java gives us hashCode(); use of modulus (%) is common
2. Must know the maximum number of items to be stored
3. Must have a method for dealing with collisions
4. Alternative collision handling methods:
   1. Linear or Quadratic Collision Processing
   2. Adding more dimensions to the array for collisions
   3. Make the hash table an array of pointers (Chaining)
5. Big O value – best and worst cases
6. Examples: bar coding, VIN numbers on vehicles

Exercise One
Below is a hash table using chaining. It is a size 5 array of ListNode pointers that will be used to store names.
Hash Function: the integer value of the middle letter % 5. If there is an even number of letters, use the letter to the left of middle. (Treat ‘a’ as 1.)

Question 1: Where would the name ‘Jacob’ be stored?
• Middle letter c = 3
• 3 mod 5 = 3

[0]
[1]
[2]
[3] Jacob
[4]

Add the following names to the hash table: Ellen, Ann, George, Fred, Susan and draw the table.

Added: Ellen
[0]
[1]
[2] Ellen
[3]
[4]
Added: Ellen, Ann, George, Fred. Remaining: Susan
[0] George
[1]
[2] Ellen
[3] Fred
[4] Ann

Added: Ellen, Ann, George, Fred, Susan
[0] George
[1]
[2] Ellen
[3] Fred
[4] Susan, Ann

Exercise Two
Hash Function: Determine a hash function that could result in the following table.
[0] 24
[1]
[2] 10, 18
[3]
[4]
[5] 21, 5, 45
[6]
[7] 31

Hash Function Solution: value modulus 8

Let’s let you battle it out in your own hash table!
• Great explanation: http://scientopia.org/blogs/goodmath/2013/10/20/basic-data-structures-hash-tables/
• Image sources: https://www.cs.auckland.ac.nz/software/AlgAnim/hash_tables.html

Unused Slides
Searching Techniques – Worst case: Sequential vs Binary
[Figure: step counts for a linear search (1, 2, 3, 4, 5, … 15) beside a binary search (1, 2, 3, 4)]
• Picture an array
• Illustrate converting unique ID to array index number
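
The Exercise One answers can be checked with a short sketch of the middle-letter hash. The class name MiddleLetterHash and the method hash are made up for this example, and the code assumes each name is non-empty and contains only the letters a to z (in either case).

```java
// Sketch of the Exercise One hash: value of the middle letter mod table size.
class MiddleLetterHash {
    // For an even number of letters, (length - 1) / 2 picks the letter
    // to the left of the middle, as the exercise requires.
    static int hash(String name, int tableSize) {
        int middle = (name.length() - 1) / 2;
        char letter = Character.toLowerCase(name.charAt(middle));
        int letterValue = letter - 'a' + 1; // treat 'a' as 1
        return letterValue % tableSize;
    }

    public static void main(String[] args) {
        String[] names = {"Jacob", "Ellen", "Ann", "George", "Fred", "Susan"};
        for (String name : names) {
            System.out.println(name + " -> [" + hash(name, 5) + "]");
        }
        // Prints:
        // Jacob -> [3]
        // Ellen -> [2]
        // Ann -> [4]
        // George -> [0]
        // Fred -> [3]
        // Susan -> [4]
    }
}
```

Susan and Ann both hash to index 4, which is why the final table chains them in the same bucket.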