Content

Vijona

25. December 2024

Hash Table Implementation in C/C++

A hash table in C/C++ is a data structure that maps keys to values. A hash table uses a hash function to compute indexes for a key. You can store the value at the appropriate location based on the hash table index.

Benefits of Hash Tables

The benefit of using a hash table is its very fast access time. Typically, the time complexity (amortized time complexity) is a constant O(1) access time.

If two different keys get the same index, you will need to use other data structures (buckets) to account for these collisions. If you choose a very good hash function, the likelihood of a collision can be negligible.

The C++ STL (Standard Template Library) has the std::unordered_map() data structure.

Constructing a Hash Table

In this article, you will construct a hash table from scratch comprised of:

A hash function to map keys to values.
A hash table data structure that supports insert, search, and delete operations.
A data structure to account for a collision of keys.

Choosing a Hash Function

The first step is to choose a reasonably good hash function that has a low chance of collision. However, for the purposes of this tutorial, a poor hash function will be applied to better illustrate hash collisions. This limited example will also only utilize strings (or character arrays in C).

Example: Hash Function

Copy Code


 
#define CAPACITY 50000 // Size of the HashTable.

unsigned long hash_function(char* str)
{
    unsigned long i = 0;

    for (int j = 0; str[j]; j++)
        i += str[j];

    return i % CAPACITY;
}

Run this code and test different strings for potential collisions. For example, the strings Hel and Cau will collide since they have the same ASCII value.

This code must return a number within the bounds of the capacity of the hash table. Otherwise, it may access an unbound memory location, leading to an error.

Defining the Hash Table Data Structures

A hash table is an array of items, which are { key: value } pairs.

Item Structure

Copy Code


 
typedef struct Ht_item
{
    char* key;
    char* value;
} Ht_item;

Now, the hash table has an array of pointers that point to Ht_item, so it is a double-pointer.

Hash Table Definition

Copy Code


 
typedef struct HashTable
{
    // Contains an array of pointers to items.
    Ht_item** items;
    int size;
    int count;
} HashTable;

Your hash table will need to return the number of elements in the hash table using count and size of the hash table using size.

Creating and Managing Hash Table

Creating Hash Table Items

Next, create functions for allocating memory and creating items. Create items by allocating memory for a key and value, and return a pointer to the item:

Create Item Function

Copy Code


 
Ht_item* create_item(char* key, char* value)
{
    // Creates a pointer to a new HashTable item.
    Ht_item* item = (Ht_item*) malloc(sizeof(Ht_item));
    item->key = (char*) malloc(strlen(key) + 1);
    item->value = (char*) malloc(strlen(value) + 1);
    strcpy(item->key, key);
    strcpy(item->value, value);
    return item;
}

Continue following this pattern for additional functions like creating the table, freeing items, and more as described in the tutorial.

Creating the Hash Table

Create the table by allocating memory and setting size, count, and items:

Create Table Function

Copy Code


 
HashTable* create_table(int size)
{
    // Creates a new HashTable.
    HashTable* table = (HashTable*) malloc(sizeof(HashTable));
    table->size = size;
    table->count = 0;
    table->items = (Ht_item**) calloc(table->size, sizeof(Ht_item*));

    for (int i = 0; i < table->size; i++)
        table->items[i] = NULL;

    return table;
}

The preceding example allocates memory for the wrapper structure HashTable and sets all the items to NULL.

Freeing Memory

Freeing up memory is a C/C++ best practice. Free up memory that you’ve allocated on the heap with malloc() and calloc().

Write functions that free up a table item and the whole table:

Free Item and Table Functions

Copy Code


 
void free_item(Ht_item* item)
{
    // Frees an item.
    free(item->key);
    free(item->value);
    free(item);
}

void free_table(HashTable* table)
{
    // Frees the table.
    for (int i = 0; i < table->size; i++)
    {
        Ht_item* item = table->items[i];

        if (item != NULL)
            free_item(item);
    }

    free(table->items);
    free(table);
}

Displaying the Hash Table

Add a print_table() to display the index, key, and value for each item:

Print Table Function

Copy Code


 
void print_table(HashTable* table)
{
    printf("\nHash Table\n-------------------\n");

    for (int i = 0; i < table->size; i++)
    {
        if (table->items[i])
        {
            printf("Index:%d, Key:%s, Value:%s\n", i, table->items[i]->key, table->items[i]->value);
        }
    }

    printf("-------------------\n\n");
}

This concludes the basic functionality of your custom hash table. You will now write insert, search, and delete functions.

Inserting into the Hash Table

Create a function, ht_insert(), that performs insertions. The function takes a HashTable pointer, a key, and a value as parameters:

Insert Function

Copy Code


 
void ht_insert(HashTable* table, char* key, char* value)
{
    // Creates the item.
    Ht_item* item = create_item(key, value);

    // Computes the index.
    int index = hash_function(key);

    Ht_item* current_item = table->items[index];

    if (current_item == NULL)
    {
        // Key does not exist.
        if (table->count == table->size)
        {
            // HashTable is full.
            printf("Insert Error: Hash Table is full\n");
            free_item(item);
            return;
        }

        // Insert directly.
        table->items[index] = item;
        table->count++;
    }
    else
    {
        // Scenario 1: Update the value.
        if (strcmp(current_item->key, key) == 0)
        {
            strcpy(table->items[index]->value, value);
            return;
        }
        else
        {
            // Scenario 2: Handle the collision.
            handle_collision(table, index, item);
            return;
        }
    }
}

Continue implementing functions for searching and handling collisions.

Searching for Items in the Hash Table

Create a function, ht_search(), that checks if the key exists and returns the corresponding value if it does. The function takes a HashTable pointer and a key as parameters:

Search Function

Copy Code


 
char* ht_search(HashTable* table, char* key)
{
    // Searches for the key in the HashTable.
    // Returns NULL if it doesn't exist.
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];

    // Provide only non-NULL values.
    if (item != NULL)
    {
        if (strcmp(item->key, key) == 0)
            return item->value;

        if (head == NULL)
            return NULL;

        item = head->item;
        head = head->next;
    }

    return NULL;
}

Print Search Function

Add a print_search() to display the item that matches the key:

Copy Code


 
void print_search(HashTable* table, char* key)
{
    char* val;

    if ((val = ht_search(table, key)) == NULL)
    {
        printf("Key:%s does not exist\n", key);
        return;
    }
    else {
        printf("Key:%s, Value:%s\n", key, val);
    }
}

Now, your ht_search() function is complete.

Handling Collisions

There are different ways to resolve a collision. This tutorial will rely upon a method called Separate Chaining, which creates independent chains for all items that have the same hash index. The implementation in this tutorial will use linked lists.

Linked List Definition

Whenever there is a collision, additional items that collide on the same index are added to an overflow bucket list:

Copy Code


 
typedef struct LinkedList {
    Ht_item* item;
    struct LinkedList* next;
} LinkedList;

LinkedList* allocate_list()
{
    // Allocates memory for a LinkedList pointer.
    LinkedList* list = (LinkedList*) malloc(sizeof(LinkedList));
    return list;
}

LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item)
{
    // Inserts the item onto the LinkedList.
    if (!list)
    {
        LinkedList* head = allocate_list();
        head->item = item;
        head->next = NULL;
        list = head;
        return list;
    }
    else if (list->next == NULL)
    {
        LinkedList* node = allocate_list();
        node->item = item;
        node->next = NULL;
        list->next = node;
        return list;
    }

    LinkedList* temp = list;

    while (temp->next->next)
    {
        temp = temp->next;
    }

    LinkedList* node = allocate_list();
    node->item = item;
    node->next = NULL;
    temp->next = node;
    return list;
}

Ht_item* linkedlist_remove(LinkedList* list)
{
    // Removes the head from the LinkedList.
    // Returns the item of the popped element.
    if (!list)
        return NULL;

    if (!list->next)
        return NULL;

    LinkedList* node = list->next;
    LinkedList* temp = list;
    temp->next = NULL;
    list = node;
    Ht_item* it = NULL;
    memcpy(temp->item, it, sizeof(Ht_item));
    free(temp->item->key);
    free(temp->item->value);
    free(temp->item);
    free(temp);
    return it;
}

void free_linkedlist(LinkedList* list)
{
    LinkedList* temp = list;

    while (list)
    {
        temp = list;
        list = list->next;
        free(temp->item->key);
        free(temp->item->value);
        free(temp->item);
        free(temp);
    }
}

Integrating Overflow Buckets

Add overflow buckets to the HashTable:

Copy Code


 
typedef struct HashTable HashTable;

// Defines the HashTable.
struct HashTable
{
    // Contains an array of pointers to items.
    Ht_item** items;
    LinkedList** overflow_buckets;
    int size;
    int count;
};

Update create_table() and free_table() to include these buckets.

Creating and Freeing Overflow Buckets

Copy Code


 
LinkedList** create_overflow_buckets(HashTable* table)
{
    // Create the overflow buckets; an array of LinkedLists.
    LinkedList** buckets = (LinkedList**) calloc(table->size, sizeof(LinkedList*));

    for (int i = 0; i < table->size; i++)
        buckets[i] = NULL;

    return buckets;
}

void free_overflow_buckets(HashTable* table)
{
    // Free all the overflow bucket lists.
    LinkedList** buckets = table->overflow_buckets;

    for (int i = 0; i < table->size; i++)
        free_linkedlist(buckets[i]);

    free(buckets);
}

Now, collisions are handled effectively. Continue with deletion logic next.

Deleting from the Hash Table

Create a function, ht_delete(), to remove an item from the hash table. The function takes a HashTable pointer and a key as parameters:

Delete Function

Copy Code


 
void ht_delete(HashTable* table, char* key)
{
    // Deletes an item from the table.
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];

    if (item == NULL)
    {
        // Does not exist.
        return;
    }
    else {
        if (head == NULL && strcmp(item->key, key) == 0)
        {
            // Collision chain does not exist.
            // Remove the item.
            // Set table index to NULL.
            table->items[index] = NULL;
            free_item(item);
            table->count--;
            return;
        }
        else if (head != NULL)
        {
            // Collision chain exists.
            if (strcmp(item->key, key) == 0)
            {
                // Remove this item.
                // Set the head of the list as the new item.
                free_item(item);
                LinkedList* node = head;
                head = head->next;
                node->next = NULL;
                table->items[index] = create_item(node->item->key, node->item->value);
                free_linkedlist(node);
                table->overflow_buckets[index] = head;
                return;
            }

            LinkedList* curr = head;
            LinkedList* prev = NULL;

            while (curr)
            {
                if (strcmp(curr->item->key, key) == 0)
                {
                    if (prev == NULL)
                    {
                        // First element of the chain.
                        // Remove the chain.
                        free_linkedlist(head);
                        table->overflow_buckets[index] = NULL;
                        return;
                    }
                    else
                    {
                        // This is somewhere in the chain.
                        prev->next = curr->next;
                        curr->next = NULL;
                        free_linkedlist(curr);
                        table->overflow_buckets[index] = head;
                        return;
                    }
                }

                curr = curr->next;
                prev = curr;
            }
        }
    }
}

Testing the Hash Table

Create a main() function to test the hash table functionality. Insert, search, and delete operations will be validated.

Main Function

Copy Code


 
int main()
{
    HashTable* ht = create_table(CAPACITY);
    ht_insert(ht, (char*)"1", (char*)"First address");
    ht_insert(ht, (char*)"2", (char*)"Second address");
    ht_insert(ht, (char*)"Hel", (char*)"Third address");
    ht_insert(ht, (char*)"Cau", (char*)"Fourth address");
    print_search(ht, (char*)"1");
    print_search(ht, (char*)"2");
    print_search(ht, (char*)"3");
    print_search(ht, (char*)"Hel");
    print_search(ht, (char*)"Cau"); // Collision!
    print_table(ht);
    ht_delete(ht, (char*)"1");
    ht_delete(ht, (char*)"Cau");
    print_table(ht);
    free_table(ht);
    return 0;
}

Output

This code will produce the following output:

Copy Code

Key:1, Value:First address Key:2, Value:Second address Key:3 does not exist Key:Hel, Value:Third address Key:Cau does not exist Hash Table ------------------- Index:49, Key:1, Value:First address Index:50, Key:2, Value:Second address Index:281, Key:Hel, Value:Third address ------------------- Hash Table ------------------- Index:50, Key:2, Value:Second address Index:281, Key:Hel, Value:Third address -------------------

Conclusion

In this article, you implemented a hash table from scratch in C/C++. Experiment with other collision handling algorithms and different hash functions to expand your understanding and improve the hash table performance.

Create a Free Account

Try now

Posts you might be interested in:

Deep Learning Architectures: ResNet, InceptionV3 & SqueezeNet

Security, Tutorial

2 weeks ago

Models That Shaped Deep Learning: 2015-2016 In this article we’ll dive deep into the models that came after this period, in 2015-2016. Big advancements were made in just these two…

Training, Validation and Accuracy in PyTorch: A Beginner’s Guide

Python, Tutorial

2 weeks ago

Deep Learning with PyTorch: Training, Validation, and Accuracy Content1 Introduction2 Prerequisites3 Imports and Setup4 Anatomy of Neural Networks5 Model Objective6 The Right Combination7 Proper Generalization8 Measuring Performance9 Dataset10 Convnet Architecture11…

Future Trends in GPU Technology: AI, Quantum Computing, and More

Linux Basics, Tutorial

2 weeks ago

Future Trends in GPU Technology Over the past few decades, GPUs have shifted from special-purpose hardware for gaming to becoming the preferred tools for a wide range of applications. This…

FEATURED PRODUCTS

Kubernetes

ccloud³

Managed Server

Cloud GPU

S3 Object Storage

COMPUTE

MANAGED

STORAGE

NETWORKING

MANAGEMENT TOOLS

BACKUPS & SNAPSHOTS

WEBSITE HOSTING

HOUSING

FEATURED INDUSTRIES

Enterprise

Saas-Hosting

Startup

INDUSTRIES

MORE INDUSTRIES

FEATURED USE CASES

Linux-Hosting

VMware Migration

Docker Hosting

USE CASES

MORE USE CASES

RESSOURCES

Help Center

Trust Center

Glossar

Tutorials

MORE CENTRON

MORE INFOS

FEATURED PRODUCTS

Kubernetes

ccloud³

Managed Server

Cloud GPU

S3 Object Storage

COMPUTE

MANAGED

STORAGE

NETWORKING

MANAGEMENT TOOLS

BACKUPS & SNAPSHOTS

WEBSITE HOSTING

HOUSING

FEATURED INDUSTRIES

Enterprise

Saas-Hosting

Startup

INDUSTRIES

MORE INDUSTRIES

FEATURED USE CASES

Linux-Hosting

VMware Migration

Docker Hosting

USE CASES

MORE USE CASES

RESSOURCES

Help Center

Trust Center

Glossar

Tutorials

MORE CENTRON

MORE INFOS

Hash Table Implementation in C/C++

Benefits of Hash Tables

Constructing a Hash Table

Choosing a Hash Function

Example: Hash Function

Defining the Hash Table Data Structures

Item Structure

Hash Table Definition

Creating and Managing Hash Table

Creating Hash Table Items

Create Item Function

Creating the Hash Table

Create Table Function

Freeing Memory