Hash Table
Hash Table
Hash table is a actually a array .
API
It’s all O(1) time and space complexity for each function.
- insert
- search
- get
- delete
Hash Function
- Single Function
- Double hash Function
Hash Collision
what is hash collision
It’s collision that different key has same hash value.
hash is a kind of summary algorithm which compress information meaning that it’s definitely possible to collision.
How to solve collision
- Linked hash table. Used by Linux kernel. Mostly used other application.
- linear detect to find next place.
- multi-level hash table.
- Cuckoo hash table.
Cuckoo hash table
Each element in the hash table is a fixed size array called bucket.
After the first hashing, looking for the index in the bucket, If it’s collision, it will look for another bucket’s place rather than reallocate space.
Advantage
- Best performance and convenient for lock-free : multi-level hash table
- best shared between processes : Cuckoo hash table. because of its fixed size, which is easy to map file to memory.
- less memory usage : linked hash table
Load Factor: When to rehashing
Way to rehash
it’s time to rehash after element rate is bigger than load factor, .
- block rehashing
- asynchronous rehashing
- double hash table for quick rehashing
- lazy rehashing: Used by Redis
Asynchronous Rehashing
- start a new process to execute the task copying from origin table to bigger one
- record incoming request while rehashing
- using message queue or something else.
Double hash table for quick rehashing
execute request for both origin table and bigger one.
replace origin one by the bigger when it’s time to rehash and reallocate a bigger than current origin table.
Lazy Rehashing
when reach the rate, allocate the space for the bigger table.
move the associated element when next request is coming.