Hash Table Implementation in C/C++
A hash table in C/C++ is a data structure that maps keys to values. A hash table uses a hash function to compute indexes for a key. You can store the value at the appropriate location based on the hash table index.
Benefits of Hash Tables
The benefit of using a hash table is its very fast access time. Typically, the time complexity (amortized time complexity) is a constant O(1) access time.
If two different keys get the same index, you will need to use other data structures (buckets) to account for these collisions. If you choose a very good hash function, the likelihood of a collision can be negligible.
The C++ STL (Standard Template Library) has the std::unordered_map()
data structure.
Constructing a Hash Table
In this article, you will construct a hash table from scratch comprised of:
- A hash function to map keys to values.
- A hash table data structure that supports insert, search, and delete operations.
- A data structure to account for a collision of keys.
Choosing a Hash Function
The first step is to choose a reasonably good hash function that has a low chance of collision. However, for the purposes of this tutorial, a poor hash function will be applied to better illustrate hash collisions. This limited example will also only utilize strings (or character arrays in C).
Example: Hash Function
#define CAPACITY 50000 // Size of the HashTable.
unsigned long hash_function(char* str)
{
unsigned long i = 0;
for (int j = 0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
Run this code and test different strings for potential collisions. For example, the strings Hel
and Cau
will collide since they have the same ASCII value.
This code must return a number within the bounds of the capacity of the hash table. Otherwise, it may access an unbound memory location, leading to an error.
Defining the Hash Table Data Structures
A hash table is an array of items, which are { key: value }
pairs.
Item Structure
typedef struct Ht_item
{
char* key;
char* value;
} Ht_item;
Now, the hash table has an array of pointers that point to Ht_item
, so it is a double-pointer.
Hash Table Definition
typedef struct HashTable
{
// Contains an array of pointers to items.
Ht_item** items;
int size;
int count;
} HashTable;
Your hash table will need to return the number of elements in the hash table using count
and size of the hash table using size
.
Creating and Managing Hash Table
Creating Hash Table Items
Next, create functions for allocating memory and creating items. Create items by allocating memory for a key and value, and return a pointer to the item:
Create Item Function
Ht_item* create_item(char* key, char* value)
{
// Creates a pointer to a new HashTable item.
Ht_item* item = (Ht_item*) malloc(sizeof(Ht_item));
item->key = (char*) malloc(strlen(key) + 1);
item->value = (char*) malloc(strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
Continue following this pattern for additional functions like creating the table, freeing items, and more as described in the tutorial.
Creating the Hash Table
Create the table by allocating memory and setting size, count, and items:
Create Table Function
HashTable* create_table(int size)
{
// Creates a new HashTable.
HashTable* table = (HashTable*) malloc(sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc(table->size, sizeof(Ht_item*));
for (int i = 0; i < table->size; i++)
table->items[i] = NULL;
return table;
}
The preceding example allocates memory for the wrapper structure HashTable
and sets all the items to NULL
.
Freeing Memory
Freeing up memory is a C/C++ best practice. Free up memory that you’ve allocated on the heap with malloc()
and calloc()
.
Write functions that free up a table item and the whole table:
Free Item and Table Functions
void free_item(Ht_item* item)
{
// Frees an item.
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table)
{
// Frees the table.
for (int i = 0; i < table->size; i++)
{
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free(table->items);
free(table);
}
Displaying the Hash Table
Add a print_table()
to display the index, key, and value for each item:
Print Table Function
void print_table(HashTable* table)
{
printf("\nHash Table\n-------------------\n");
for (int i = 0; i < table->size; i++)
{
if (table->items[i])
{
printf("Index:%d, Key:%s, Value:%s\n", i, table->items[i]->key, table->items[i]->value);
}
}
printf("-------------------\n\n");
}
This concludes the basic functionality of your custom hash table. You will now write insert, search, and delete functions.
Inserting into the Hash Table
Create a function, ht_insert()
, that performs insertions. The function takes a HashTable
pointer, a key, and a value as parameters:
Insert Function
void ht_insert(HashTable* table, char* key, char* value)
{
// Creates the item.
Ht_item* item = create_item(key, value);
// Computes the index.
int index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL)
{
// Key does not exist.
if (table->count == table->size)
{
// HashTable is full.
printf("Insert Error: Hash Table is full\n");
free_item(item);
return;
}
// Insert directly.
table->items[index] = item;
table->count++;
}
else
{
// Scenario 1: Update the value.
if (strcmp(current_item->key, key) == 0)
{
strcpy(table->items[index]->value, value);
return;
}
else
{
// Scenario 2: Handle the collision.
handle_collision(table, index, item);
return;
}
}
}
Continue implementing functions for searching and handling collisions.
Searching for Items in the Hash Table
Create a function, ht_search()
, that checks if the key exists and returns the corresponding value if it does. The function takes a HashTable
pointer and a key as parameters:
Search Function
char* ht_search(HashTable* table, char* key)
{
// Searches for the key in the HashTable.
// Returns NULL if it doesn't exist.
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
// Provide only non-NULL values.
if (item != NULL)
{
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
Print Search Function
Add a print_search()
to display the item that matches the key:
void print_search(HashTable* table, char* key)
{
char* val;
if ((val = ht_search(table, key)) == NULL)
{
printf("Key:%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
Now, your ht_search()
function is complete.
Handling Collisions
There are different ways to resolve a collision. This tutorial will rely upon a method called Separate Chaining, which creates independent chains for all items that have the same hash index. The implementation in this tutorial will use linked lists.
Linked List Definition
Whenever there is a collision, additional items that collide on the same index are added to an overflow bucket list:
typedef struct LinkedList {
Ht_item* item;
struct LinkedList* next;
} LinkedList;
LinkedList* allocate_list()
{
// Allocates memory for a LinkedList pointer.
LinkedList* list = (LinkedList*) malloc(sizeof(LinkedList));
return list;
}
LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item)
{
// Inserts the item onto the LinkedList.
if (!list)
{
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL)
{
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next)
{
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
Ht_item* linkedlist_remove(LinkedList* list)
{
// Removes the head from the LinkedList.
// Returns the item of the popped element.
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
void free_linkedlist(LinkedList* list)
{
LinkedList* temp = list;
while (list)
{
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
Integrating Overflow Buckets
Add overflow buckets to the HashTable
:
typedef struct HashTable HashTable;
// Defines the HashTable.
struct HashTable
{
// Contains an array of pointers to items.
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
Update create_table()
and free_table()
to include these buckets.
Creating and Freeing Overflow Buckets
LinkedList** create_overflow_buckets(HashTable* table)
{
// Create the overflow buckets; an array of LinkedLists.
LinkedList** buckets = (LinkedList**) calloc(table->size, sizeof(LinkedList*));
for (int i = 0; i < table->size; i++)
buckets[i] = NULL;
return buckets;
}
void free_overflow_buckets(HashTable* table)
{
// Free all the overflow bucket lists.
LinkedList** buckets = table->overflow_buckets;
for (int i = 0; i < table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
Now, collisions are handled effectively. Continue with deletion logic next.
Deleting from the Hash Table
Create a function, ht_delete()
, to remove an item from the hash table. The function takes a HashTable
pointer and a key as parameters:
Delete Function
void ht_delete(HashTable* table, char* key)
{
// Deletes an item from the table.
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
if (item == NULL)
{
// Does not exist.
return;
}
else {
if (head == NULL && strcmp(item->key, key) == 0)
{
// Collision chain does not exist.
// Remove the item.
// Set table index to NULL.
table->items[index] = NULL;
free_item(item);
table->count--;
return;
}
else if (head != NULL)
{
// Collision chain exists.
if (strcmp(item->key, key) == 0)
{
// Remove this item.
// Set the head of the list as the new item.
free_item(item);
LinkedList* node = head;
head = head->next;
node->next = NULL;
table->items[index] = create_item(node->item->key, node->item->value);
free_linkedlist(node);
table->overflow_buckets[index] = head;
return;
}
LinkedList* curr = head;
LinkedList* prev = NULL;
while (curr)
{
if (strcmp(curr->item->key, key) == 0)
{
if (prev == NULL)
{
// First element of the chain.
// Remove the chain.
free_linkedlist(head);
table->overflow_buckets[index] = NULL;
return;
}
else
{
// This is somewhere in the chain.
prev->next = curr->next;
curr->next = NULL;
free_linkedlist(curr);
table->overflow_buckets[index] = head;
return;
}
}
curr = curr->next;
prev = curr;
}
}
}
}
Testing the Hash Table
Create a main()
function to test the hash table functionality. Insert, search, and delete operations will be validated.
Main Function
int main()
{
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, (char*)"1", (char*)"First address");
ht_insert(ht, (char*)"2", (char*)"Second address");
ht_insert(ht, (char*)"Hel", (char*)"Third address");
ht_insert(ht, (char*)"Cau", (char*)"Fourth address");
print_search(ht, (char*)"1");
print_search(ht, (char*)"2");
print_search(ht, (char*)"3");
print_search(ht, (char*)"Hel");
print_search(ht, (char*)"Cau"); // Collision!
print_table(ht);
ht_delete(ht, (char*)"1");
ht_delete(ht, (char*)"Cau");
print_table(ht);
free_table(ht);
return 0;
}
Output
This code will produce the following output:
Key:1, Value:First address
Key:2, Value:Second address
Key:3 does not exist
Key:Hel, Value:Third address
Key:Cau does not exist
Hash Table
-------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address
-------------------
Hash Table
-------------------
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address
-------------------
Conclusion
In this article, you implemented a hash table from scratch in C/C++. Experiment with other collision handling algorithms and different hash functions to expand your understanding and improve the hash table performance.