Our hash table (found in hash_map.c in libutils) is of fixed size 8192. So basically this is not a proper hash table. Is also has some more disadvantages. This task is about refactoring it.
- Make our hash table number of buckets auto-expanding (and maybe contracting) according to the current load factor. Careful here:
- Power-of-two total bucket count is faster, but requires very good hash function
- Prime number total bucket count is more common and safer choice (avoids collisions even with trivial hash functions), but requires more careful and slower max-bounding of the hash function result (will not work without fixing the secondary issue mentioned later)
- Change the starting bucket count to a number smaller than 8192, since it is so commonly used this should save memory.
- Run map_test.c before and after your changes, update it as needed and write whatever new tests you want
Relevant, but secondary and somewhat more tricky issue:
- Currently all our hash functions have an argument for "max" bounding, according to the number of buckets. This does not work currently for non-power-of-2 max values, and it's duplicated code all over the place.
- So max-bounding should be moved to a function of its own, called by hash_map.c to limit the hash_fn() result. And if a non-power-of-2 total bucket number is used, the max-bounding function should work on that.
- the hash functions used for the various hash tables instantiated are all over the code base and passed to the hash table as hash_fn; However you can the find them via grep'ing for TYPED_MAP_DEFINE, in general the mostly used hash function is StringHash(),
Any other improvements you think should be done? Either fix or note them down as TODOs.