Implementation and Analysisof Open Addressed Hash Tables

An open-addressed hash table fundamentally consists of a single array. The structure OHTbl is the open-addressed hash table data structure (see Example 8.6). This structure consists of eight members: positions is the number of positions allocated in the hash table; vacated is a pointer that will be initialized to a special storage location to indicate that a particular position in the table has had an element removed from it; h1, h2, match, and destroy are members used to encapsulate the functions passed to ohtbl_init ; size is the number of elements currently in the table; and table is the array in which the elements are stored.

The vacated member requires a bit of discussion. Its purpose is to support the removal of elements. An unoccupied position in an open-addressed hash table usually contains a NULL pointer. However, when we remove an element, we cannot set its data pointer back to NULL because when probing to look up a subsequent element, NULL would indicate that the position is unoccupied and no more probes should be performed. In actuality, one or more elements may have been inserted by probing past the removed element while it was still in the table.

Considering this, we set the data pointer to the vacated member of the hash table data structure when we remove an element. The address of vacated serves as a special sentinel to indicate that a new element may be inserted at the position. This way, when probing to look up an element, we are assured that a NULL really means to stop probing.

Example 8.6. Header for the Open-Addressed Hash Table Abstract Datatype
/*****************************************************************************
*                                                                            *
*  ------------------------------- ohtbl.h --------------------------------  *
*                                                                            *
*****************************************************************************/

#ifndef OHTBL_H
#define OHTBL_H

#include <stdlib.h>

/*****************************************************************************
*                                                                            *
*  Define a structure for open-addressed hash tables.                        *
*                                                                            *
*****************************************************************************/

typedef struct OHTbl_ {

int                positions;
void               *vacated;

int                (*h1)(const void *key);
int                (*h2)(const void *key);
int                (*match)(const void *key1, const void *key2);
void               (*destroy)(void *data);

int                size;
void               **table;

} OHTbl;

/*****************************************************************************
*                                                                            *
*  --------------------------- Public Interface ---------------------------  *
*                                                                            *
*****************************************************************************/

int ohtbl_init(OHTbl *htbl, int positions, int (*h1)(const void *key), int
   (*h2)(const void *key), int (*match)(const void *key1, const void *key2),
   void (*destroy)(void *data));

void ohtbl_destroy(OHTbl *htbl);

int ohtbl_insert(OHTbl *htbl, const void *data);

int ohtbl_remove(OHTbl *htbl, void **data);

int ohtbl_lookup(const OHTbl *htbl, void **data);

#define ohtbl_size(htbl) ((htbl)->size)

#endif

ohtbl_init

The ohtbl_init operation initializes an open-addressed hash table so that it can be used in other operations (see Example 8.7). Initializing an open-addressed hash table is a simple operation in which we allocate space for the table; initialize the pointer in each position to NULL; encapsulate the h1, h2, match and destroy functions; initialize vacated to its sentinel address; and set the size member to 0.

The runtime complexity of ohtbl_init is O (m), where m is the number of positions in the table. This is because the data pointer in each of the m positions must be initialized to NULL, and all other parts of the operation run in a constant amount of time.

ohtbl_destroy

The ohtbl_destroy operation destroys an open-addressed hash table (see Example 8.7). Primarily this means freeing the memory ohtbl_init allocated for the table. The function passed as destroy to ohtbl_init is called once for each element as it is removed, provided destroy was not set to NULL.

The runtime complexity of ohtbl_destroy is O (m), where m is the number of positions in the hash table. This is because we must traverse all positions in the hash table to determine which are occupied. If destroy is NULL, ohtbl_destroy runs in O (1) time.

ohtbl_insert

The ohtbl_insert operation inserts an element into an open-addressed hash table (see Example 8.7). Since an open-addressed hash table has a fixed size, we first ensure that there is room for the new element to be inserted. Also, since a key is not allowed to be inserted into the hash table more than once, we call ohtbl_lookup to make sure the table does not already contain the new element.

Once these conditions are met, we use double hashing to probe the table for an unoccupied position. A position in the table is unoccupied if it points either to NULL or the address in vacated, a special member of the hash table data structure that indicates that a position has had an element removed from it. Once we find an unoccupied position in the table, we set the pointer at that position to point to the data we wish to insert. After this, we increment the table size.

Assuming we approximate uniform hashing well and the load factor of the hash table is relatively small, the runtime complexity of ohtbl_insert is O (1). This is because in order to find an unoccupied position at which to insert the element, we expect to probe 1/(1 - α) positions, a number treated as a small constant, where α is the load factor of the hash table.

ohtbl_remove

The ohtbl_remove operation removes an element from an open-addressed hash table (see Example 8.7). To remove the element, we use double hashing as in ohtbl_insert to locate the position at which the element resides. We continue searching until we locate the element or NULL is found. If we find the element, we set data to the data being removed and decrease the table size by 1. Also, we set the position in the table to the vacated member of the hash table data structure.

Assuming we approximate uniform hashing well, the runtime complexity of ohtbl_remove is O (1). This is because we expect to probe 1/(1 - α) positions, a number treated as a small constant, where α is the largest load factor of the hash table since calling ohtbl_init. The reason that the performance of this operation depends on the largest load factor and thus does not improve as elements are removed is that we must still probe past vacated positions. The use of the vacated member only improves the performance of ohtbl_insert.

ohtbl_lookup

The ohtbl_lookup operation searches for an element in an open-addressed hash table and returns a pointer to it (see Example 8.7). This operation works similarly to ohtbl_remove, except that the element is not removed from the table.

Assuming we approximate uniform hashing well, the runtime complexity of ohtbl_lookup is the same as ohtbl_remove, or O (1). This is because we expect to probe 1/(1 - α) positions, a number treated as a small constant, where α is the largest load factor of the hash table since calling ohtbl_init. The reason that performance depends on the largest load factor since calling ohtbl_init is the same as described for ohtbl_remove.

ohtbl_size

This macro evaluates to the number of elements in an open-addressed hash table (see Example 8.6). It works by accessing the size member of the OHTbl structure.

The runtime complexity of ohtbl_size is O (1) because accessing a member of a structure is a simple task that runs in a constant amount of time.

Example 8.7. Implementation of the Open-Addressed Hash Table Abstract Datatype
/*****************************************************************************
*                                                                            *
*  ------------------------------- ohtbl.c --------------------------------  *
*                                                                            *
*****************************************************************************/

#include <stdlib.h>
#include <string.h>

#include "ohtbl.h"

/*****************************************************************************
*                                                                            *
*  Reserve a sentinel memory address for vacated elements.                   *
*                                                                            *
*****************************************************************************/

static char        vacated;

/*****************************************************************************
*                                                                            *
*  ------------------------------ ohtbl_init ------------------------------  *
*                                                                            *
*****************************************************************************/

int ohtbl_init(OHTbl *htbl, int positions, int (*h1)(const void *key), int
   (*h2)(const void *key), int (*match)(const void *key1, const void *key2),
   void (*destroy)(void *data)) {

int                i;

/*****************************************************************************
*                                                                            *
*  Allocate space for the hash table.                                        *
*                                                                            *
*****************************************************************************/

if ((htbl->table = (void **)malloc(positions * sizeof(void *))) == NULL)
   return -1;

/*****************************************************************************
*                                                                            *
*  Initialize each position.                                                 *
*                                                                            *
*****************************************************************************/

htbl->positions = positions;

for (i = 0; i < htbl->positions; i++)
   htbl->table[i] = NULL;

/*****************************************************************************
*                                                                            *
*  Set the vacated member to the sentinel memory address reserved for this.  *
*                                                                            *
*****************************************************************************/

htbl->vacated = &vacated;

/*****************************************************************************
*                                                                            *
*  Encapsulate the functions.                                                *
*                                                                            *
*****************************************************************************/

htbl->h1 = h1;
htbl->h2 = h2;
htbl->match = match;
htbl->destroy = destroy;

/*****************************************************************************
*                                                                            *
*  Initialize the number of elements in the table.                           *
*                                                                            *
*****************************************************************************/

htbl->size = 0;

return 0;

}

/*****************************************************************************
*                                                                            *
*  ---------------------------- ohtbl_destroy -----------------------------  *
*                                                                            *
*****************************************************************************/

void ohtbl_destroy(OHTbl *htbl) {

int                i;
 
if (htbl->destroy != NULL) {

   /**************************************************************************
   *                                                                         *
   *  Call a user-defined function to free dynamically allocated data.       *
   *                                                                         *
   **************************************************************************/

   for (i = 0; i < htbl->positions; i++) {

      if (htbl->table[i] != NULL && htbl->table[i] != htbl->vacated)
         htbl->destroy(htbl->table[i]);

   }

}

/*****************************************************************************
*                                                                            *
*  Free the storage allocated for the hash table.                            *
*                                                                            *
*****************************************************************************/

free(htbl->table);

/*****************************************************************************
*                                                                            *
*  No operations are allowed now, but clear the structure as a precaution.   *
*                                                                            *
*****************************************************************************/

memset(htbl, 0, sizeof(OHTbl));

return;

}

/*****************************************************************************
*                                                                            *
*  ----------------------------- ohtbl_insert -----------------------------  *
*                                                                            *
*****************************************************************************/

int ohtbl_insert(OHTbl *htbl, const void *data) {

void               *temp;

int                position,
                   i;
 
/*****************************************************************************
*                                                                            *
*  Do not exceed the number of positions in the table.                       *
*                                                                            *
*****************************************************************************/

if (htbl->size == htbl->positions)
   return -1;

/*****************************************************************************
*                                                                            *
*  Do nothing if the data is already in the table.                           *
*                                                                            *
*****************************************************************************/

temp = (void *)data;

if (ohtbl_lookup(htbl, &temp) == 0)
   return 1;

/*****************************************************************************
*                                                                            *
*  Use double hashing to hash the key.                                       *
*                                                                            *
*****************************************************************************/

for (i = 0; i < htbl->positions; i++) {

   position = (htbl->h1(data) + (i * htbl->h2(data))) % htbl->positions;

   if (htbl->table[position] == NULL || htbl->table[position] == htbl->
      vacated) {

      /***********************************************************************
      *                                                                      *
      *  Insert the data into the table.                                     *
      *                                                                      *
      ***********************************************************************/

      htbl->table[position] = (void *)data;
      htbl->size++;
      return 0;

   }

}

/*****************************************************************************
*                                                                            *
*  Return that the hash functions were selected incorrectly.                 *
*                                                                            *
*****************************************************************************/

return -1;

}

/*****************************************************************************
*                                                                            *
*  ----------------------------- ohtbl_remove -----------------------------  *
*                                                                            *
*****************************************************************************/

int ohtbl_remove(OHTbl *htbl, void **data) {

int                position,
                   i;
 
/*****************************************************************************
*                                                                            *
*  Use double hashing to hash the key.                                       *
*                                                                            *
*****************************************************************************/

for (i = 0; i < htbl->positions; i++) {

   position = (htbl->h1(*data) + (i * htbl->h2(*data))) % htbl->positions;

   if (htbl->table[position] == NULL) {

      /***********************************************************************
      *                                                                      *
      *  Return that the data was not found.                                 *
      *                                                                      *
      ***********************************************************************/

      return -1;

      }

   else if (htbl->table[position] == htbl->vacated) {

      /***********************************************************************
      *                                                                      *
      *  Search beyond vacated positions.                                    *
      *                                                                      *
      ***********************************************************************/

      continue;

      }

   else if (htbl->match(htbl->table[position], *data)) {

      /***********************************************************************
      *                                                                      *
      *  Pass back the data from the table.                                  *
      *                                                                      *
      ***********************************************************************/

      *data = htbl->table[position];
      htbl->table[position] = htbl->vacated;
      htbl->size--;
      return 0;

   }

}

/*****************************************************************************
*                                                                            *
*  Return that the data was not found.                                       *
*                                                                            *
*****************************************************************************/

return -1;

}

/*****************************************************************************
*                                                                            *
*  ----------------------------- ohtbl_lookup -----------------------------  *
*                                                                            *
*****************************************************************************/

int ohtbl_lookup(const OHTbl *htbl, void **data) {

int                position,
                   i;
 
/*****************************************************************************
*                                                                            *
*  Use double hashing to hash the key.                                       *
*                                                                            *
*****************************************************************************/

for (i = 0; i < htbl->positions; i++) {

   position = (htbl->h1(*data) + (i * htbl->h2(*data))) % htbl->positions;

   if (htbl->table[position] == NULL) {

      /***********************************************************************
      *                                                                      *
      *  Return that the data was not found.                                 *
      *                                                                      *
      ***********************************************************************/

      return -1;

      }

   else if (htbl->match(htbl->table[position], *data)) {

      /***********************************************************************
      *                                                                      *
      *  Pass back the data from the table.                                  *
      *                                                                      *
      ***********************************************************************/

      *data = htbl->table[position];
      return 0;

   }

}

/*****************************************************************************
*                                                                            *
*  Return that the data was not found.                                       *
*                                                                            *
*****************************************************************************/

return -1;

}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset