Lab 8: HashMap

Introduction
MyHashMap
So… How Fast Is It (Redux)?
Optional Exercises
Lab Debrief and Submission

Introduction

In this lab, you’ll create MyHashMap, an implementation of the Map61B interface, which represents a hash map. This will be very similar to Lab 7, except this time we’re building a Hash Map rather than a Tree Map.

After you’ve completed your implementation, you’ll compare the performance of your implementation to a list-based Map implementation ULLMap as well as the built-in Java HashMap class (which also uses a hash table).

MyHashMap

Create a class MyHashMap that implements the Map61B interface. You must do this in a file named MyHashMap.java. Your implementation is required to implement all of the methods given in Map61B except for remove. For this method you should throw an UnsupportedOperationException. Note that you should implement keySet and iterator this time, where iterator returns an Iterator that iterates over the stored keys (and both of these functions may return the keys in any order). For these methods, we recommend you simply create a HashSet instance variable that holds all your keys.

Additionally, you should implement the following constructors:

public MyHashMap();
public MyHashMap(int initialSize);
public MyHashMap(int initialSize, double loadFactor);

Your MyHashMap should initially have a number of buckets equal to initialSize. You should increase the size of your MyHashMap when the load factor exceeds the set loadFactor. If initialSize and loadFactor aren’t given, you should set defaults initialSize = 16 and loadFactor = 0.75 (as Java’s built-in HashMap does). When resizing, make sure to multiplicatively increase the size, not additively (e.g. multiply by 2, don’t add 100 or something). You are not required to resize down. Your MyHashMap operations should all be constant amortized time, assuming that the hashCode of any objects inserted spread things out nicely (hint: every Object in Java has its own hashCode method).

If the same key is inserted more than once, the value should be updated each time. You can assume null keys will never be inserted.

You should handle collisions with separate chaining. You may not import any libraries other than ArrayList, LinkedList, HashSet, iterator and Set.

You can test your implementation using TestMyHashMap.java.

You may find the following resources useful:

HashMap code from pages 136 and 137 of Data Structures Into Java, from our course references page.
Chapter 3.4 of our optional Algorithms textbook.
HashTable code code from our optional textbook.
ULLMap.java (provided), a working unordered linked list based Map61B implementation.
Lecture 19 slides.

One thing to note before getting started is that because of a problem similar to the one you saw when writing ArrayDeque, you will need to initialize your buckets as something like (List<T>[]) new ArrayList[size] if you want to create an array of lists since Java has some weird restrictions around generic types.

So… How Fast Is It (Redux)?

There are two interactive speed tests provided in InsertRandomSpeedTest.java and InsertInOrderSpeedTest.java. Do not attempt to run these tests before you’ve completed MyHashMap. Once you’re ready, you can run the tests in IntelliJ.

The InsertRandomSpeedTest class performs tests on element-insertion speed of your MyHashMap, ULLMap (provided), and Java’s built-in HashMap. It works by asking the user for an input size N, then generates N Strings of length 10 and inserts them into the maps as <String, Integer> pairs.

Try it out and see how your data structure scales with N compared to the naive and industrial-strength implementations. Record your results in a file named speedTestResults.txt. There is no standard format required for your results, and there is no required number of data points.

Now try running InsertInOrderSpeedTest, which behaves similarly to InsertRandomSpeedTest, except this time the Strings in <String, Integer> key-value pairs are inserted in lexicographically-increasing order. Note that unlike lab7, your code should be in the rough ballpark of Java’s built in solution – say within a factor of 10 or so. What this tells us is that state-of-the-art HashMaps are relatively easy to implement compared to state-of-the-art TreeMaps. When would it be better to use a BSTMap/TreeMap instead of a HashMap? Discuss this with your labmates, and add your answer to speedTestResults.txt.

Optional Exercises

This will not be graded, but you can still receive feedback on the autograder.

Implement the methods remove(K key) and remove(K key, V value), in your MyHashMap class. For an extra challenge, implement keySet() and iterator without using a second instance variable to store the set of keys.

For remove, you should return null if the argument key does not exist in the MyHashMap. Otherwise, delete the key-value pair (key, value) and return value.

Lab Debrief and Submission

At the end of lab, your TA will go over the reference solution. This will be helpful if you haven’t finished the lab, since we don’t want you to be stuck working on lab too much outside of lab. (This is also an incentive for you to go to lab!)

Make sure to submit your completed MyHashMap.java and speedTestResults.txt, and submit through git and Gradescope as usual.