Premise
A hashmap has a time complexity of O(1) for all operations.
Problem
You have to find a constant time method to uniformly random selection of a key in a Hash map.
Assumptions
- The map can grow to memory size.
- You can use any readymade hash map.
Solution
I’m discussing the pseudo code for a Python solution. All operations work in constant time except remove()
.
randMap = {}
def insert(k, v):
randMap[k] = v
The random must pick a random key. But how? We can store all the keys in a list and then run random()
on the list indexes.
randMap = {}
indexMap = {}
keyList = []
last = 0
def insert(k, v):
randMap[k] = v
keyList.append(k)
last += 1
# Maintain index of each key
indexMap[k] = len(keyList)
def getRandom():
start = 0
end = len(keyList) - 1
randomIndex = random(start, end)
key = keyList[randomIndex]
return key
How do you delete a key?
This is the crux of the problem. Deletion of a key would need us to change the index array too. That means we need to shift all elements of the array and complexity would shoot to O(n).
def remove(k):
# Remove from the hash map
randMap.pop(k, None)
# Remove from key list
index = indexMap[k]
# This is O(n) operation
keyList.remove(index)
# Pop the index too
indexMap.pop(k)
How to improve deletion?
- Delete the entry and mark the entry invalid. The probability distribution does not change, however with more and more deletions, you have a sparse array. Thus you would need multiple
getRandom()
to get a valid key. - Move the last element to the deleted element. Adjust the index of the last element.
def remove(k):
# Remove from the hash map
randMap.pop(k, None)
index = indexMap[k]
indexMap[last] = index
# This is O(1) operation
keyList[index] = keyLast[last]
# Pop the index too
indexMap.pop(k)
last -= 1
This is a constant time solution.