An open-addressed hash table fundamentally consists of a
single array. The structure OHTbl
is the open-addressed
hash table data structure (see Example
8.6). This structure consists of eight members:
positions
is the number of positions
allocated in the hash table; vacated
is a
pointer that will be initialized to a special storage location to
indicate that a particular position in the table has had an element
removed from it; h1
,
h2
, match
, and
destroy
are members used to encapsulate the
functions passed to ohtbl_init ;
size
is the number of elements currently in
the table; and table
is the array in which
the elements are stored.
The vacated
member requires a bit of
discussion. Its purpose is to support the removal of elements. An
unoccupied position in an open-addressed hash table usually contains a
NULL pointer. However, when we remove an element, we cannot set its
data pointer back to NULL because when probing to look up a subsequent element, NULL would
indicate that the position is unoccupied and no more probes should be
performed. In actuality, one or more elements may have been inserted
by probing past the removed element while it was still in the
table.
Considering this, we set the data pointer to the
vacated
member of the hash table data
structure when we remove an element. The address of
vacated
serves as a special sentinel to
indicate that a new element may be inserted at the position. This way,
when probing to look up an element, we are assured that a NULL really
means to stop probing.
/***************************************************************************** * * * ------------------------------- ohtbl.h -------------------------------- * * * *****************************************************************************/ #ifndef OHTBL_H #define OHTBL_H #include <stdlib.h> /***************************************************************************** * * * Define a structure for open-addressed hash tables. * * * *****************************************************************************/ typedef struct OHTbl_ { int positions; void *vacated; int (*h1)(const void *key); int (*h2)(const void *key); int (*match)(const void *key1, const void *key2); void (*destroy)(void *data); int size; void **table; } OHTbl; /***************************************************************************** * * * --------------------------- Public Interface --------------------------- * * * *****************************************************************************/ int ohtbl_init(OHTbl *htbl, int positions, int (*h1)(const void *key), int (*h2)(const void *key), int (*match)(const void *key1, const void *key2), void (*destroy)(void *data)); void ohtbl_destroy(OHTbl *htbl); int ohtbl_insert(OHTbl *htbl, const void *data); int ohtbl_remove(OHTbl *htbl, void **data); int ohtbl_lookup(const OHTbl *htbl, void **data); #define ohtbl_size(htbl) ((htbl)->size) #endif
The ohtbl_init operation
initializes an open-addressed hash table so that it can be used in
other operations (see Example
8.7). Initializing an open-addressed hash table is a simple
operation in which we allocate space for the table; initialize the
pointer in each position to NULL; encapsulate the
h1
, h2
,
match
and
destroy
functions; initialize
vacated
to its sentinel address; and set
the size
member to 0.
The runtime complexity of ohtbl_init is O (m), where m is the number of positions in the table. This is because the data pointer in each of the m positions must be initialized to NULL, and all other parts of the operation run in a constant amount of time.
The ohtbl_destroy operation
destroys an open-addressed hash table (see Example 8.7). Primarily this means
freeing the memory ohtbl_init allocated for the
table. The function passed as destroy
to
ohtbl_init is called once for each element as
it is removed, provided destroy
was not
set to NULL.
The runtime complexity of ohtbl_destroy
is O (m), where
m is the number of positions in the hash table.
This is because we must traverse all positions in the hash table to
determine which are occupied. If destroy
is NULL, ohtbl_destroy runs in
O (1) time.
The ohtbl_insert operation inserts an element into an open-addressed hash table (see Example 8.7). Since an open-addressed hash table has a fixed size, we first ensure that there is room for the new element to be inserted. Also, since a key is not allowed to be inserted into the hash table more than once, we call ohtbl_lookup to make sure the table does not already contain the new element.
Once these conditions are met, we use double hashing to probe
the table for an unoccupied position. A position in the table is
unoccupied if it points either to NULL or the address in
vacated
, a special member of the hash table data structure that
indicates that a position has had an element removed from it. Once
we find an unoccupied position in the table, we set the pointer at
that position to point to the data we wish to insert. After this, we
increment the table size.
Assuming we approximate uniform hashing well and the load factor of the hash table is relatively small, the runtime complexity of ohtbl_insert is O (1). This is because in order to find an unoccupied position at which to insert the element, we expect to probe 1/(1 - α) positions, a number treated as a small constant, where α is the load factor of the hash table.
The ohtbl_remove operation
removes an element from an open-addressed hash table (see Example 8.7). To remove the element, we
use double hashing as in ohtbl_insert to locate
the position at which the element resides. We continue searching
until we locate the element or NULL is found. If we find the
element, we set data
to the data being
removed and decrease the table size by 1. Also, we set the position
in the table to the vacated
member of the
hash table data structure.
Assuming we approximate uniform hashing well, the runtime
complexity of ohtbl_remove is
O (1). This is because we expect to probe 1/(1
- α) positions, a number
treated as a small constant, where α is the largest load factor of the hash
table since calling ohtbl_init. The reason that
the performance of this operation depends on the largest load factor
and thus does not improve as elements are removed is that we must
still probe past vacated positions. The use of the
vacated
member only improves the
performance of ohtbl_insert.
The ohtbl_lookup operation searches for an element in an open-addressed hash table and returns a pointer to it (see Example 8.7). This operation works similarly to ohtbl_remove, except that the element is not removed from the table.
Assuming we approximate uniform hashing well, the runtime complexity of ohtbl_lookup is the same as ohtbl_remove, or O (1). This is because we expect to probe 1/(1 - α) positions, a number treated as a small constant, where α is the largest load factor of the hash table since calling ohtbl_init. The reason that performance depends on the largest load factor since calling ohtbl_init is the same as described for ohtbl_remove.
This macro evaluates to the number of elements in an
open-addressed hash table (see Example
8.6). It works by accessing the
size
member of the
OHTbl
structure.
The runtime complexity of ohtbl_size is O (1) because accessing a member of a structure is a simple task that runs in a constant amount of time.
/***************************************************************************** * * * ------------------------------- ohtbl.c -------------------------------- * * * *****************************************************************************/ #include <stdlib.h> #include <string.h> #include "ohtbl.h" /***************************************************************************** * * * Reserve a sentinel memory address for vacated elements. * * * *****************************************************************************/ static char vacated; /***************************************************************************** * * * ------------------------------ ohtbl_init ------------------------------ * * * *****************************************************************************/ int ohtbl_init(OHTbl *htbl, int positions, int (*h1)(const void *key), int (*h2)(const void *key), int (*match)(const void *key1, const void *key2), void (*destroy)(void *data)) { int i; /***************************************************************************** * * * Allocate space for the hash table. * * * *****************************************************************************/ if ((htbl->table = (void **)malloc(positions * sizeof(void *))) == NULL) return -1; /***************************************************************************** * * * Initialize each position. * * * *****************************************************************************/ htbl->positions = positions; for (i = 0; i < htbl->positions; i++) htbl->table[i] = NULL; /***************************************************************************** * * * Set the vacated member to the sentinel memory address reserved for this. * * * *****************************************************************************/ htbl->vacated = &vacated; /***************************************************************************** * * * Encapsulate the functions. * * * *****************************************************************************/ htbl->h1 = h1; htbl->h2 = h2; htbl->match = match; htbl->destroy = destroy; /***************************************************************************** * * * Initialize the number of elements in the table. * * * *****************************************************************************/ htbl->size = 0; return 0; } /***************************************************************************** * * * ---------------------------- ohtbl_destroy ----------------------------- * * * *****************************************************************************/ void ohtbl_destroy(OHTbl *htbl) { int i; if (htbl->destroy != NULL) { /************************************************************************** * * * Call a user-defined function to free dynamically allocated data. * * * **************************************************************************/ for (i = 0; i < htbl->positions; i++) { if (htbl->table[i] != NULL && htbl->table[i] != htbl->vacated) htbl->destroy(htbl->table[i]); } } /***************************************************************************** * * * Free the storage allocated for the hash table. * * * *****************************************************************************/ free(htbl->table); /***************************************************************************** * * * No operations are allowed now, but clear the structure as a precaution. * * * *****************************************************************************/ memset(htbl, 0, sizeof(OHTbl)); return; } /***************************************************************************** * * * ----------------------------- ohtbl_insert ----------------------------- * * * *****************************************************************************/ int ohtbl_insert(OHTbl *htbl, const void *data) { void *temp; int position, i; /***************************************************************************** * * * Do not exceed the number of positions in the table. * * * *****************************************************************************/ if (htbl->size == htbl->positions) return -1; /***************************************************************************** * * * Do nothing if the data is already in the table. * * * *****************************************************************************/ temp = (void *)data; if (ohtbl_lookup(htbl, &temp) == 0) return 1; /***************************************************************************** * * * Use double hashing to hash the key. * * * *****************************************************************************/ for (i = 0; i < htbl->positions; i++) { position = (htbl->h1(data) + (i * htbl->h2(data))) % htbl->positions; if (htbl->table[position] == NULL || htbl->table[position] == htbl-> vacated) { /*********************************************************************** * * * Insert the data into the table. * * * ***********************************************************************/ htbl->table[position] = (void *)data; htbl->size++; return 0; } } /***************************************************************************** * * * Return that the hash functions were selected incorrectly. * * * *****************************************************************************/ return -1; } /***************************************************************************** * * * ----------------------------- ohtbl_remove ----------------------------- * * * *****************************************************************************/ int ohtbl_remove(OHTbl *htbl, void **data) { int position, i; /***************************************************************************** * * * Use double hashing to hash the key. * * * *****************************************************************************/ for (i = 0; i < htbl->positions; i++) { position = (htbl->h1(*data) + (i * htbl->h2(*data))) % htbl->positions; if (htbl->table[position] == NULL) { /*********************************************************************** * * * Return that the data was not found. * * * ***********************************************************************/ return -1; } else if (htbl->table[position] == htbl->vacated) { /*********************************************************************** * * * Search beyond vacated positions. * * * ***********************************************************************/ continue; } else if (htbl->match(htbl->table[position], *data)) { /*********************************************************************** * * * Pass back the data from the table. * * * ***********************************************************************/ *data = htbl->table[position]; htbl->table[position] = htbl->vacated; htbl->size--; return 0; } } /***************************************************************************** * * * Return that the data was not found. * * * *****************************************************************************/ return -1; } /***************************************************************************** * * * ----------------------------- ohtbl_lookup ----------------------------- * * * *****************************************************************************/ int ohtbl_lookup(const OHTbl *htbl, void **data) { int position, i; /***************************************************************************** * * * Use double hashing to hash the key. * * * *****************************************************************************/ for (i = 0; i < htbl->positions; i++) { position = (htbl->h1(*data) + (i * htbl->h2(*data))) % htbl->positions; if (htbl->table[position] == NULL) { /*********************************************************************** * * * Return that the data was not found. * * * ***********************************************************************/ return -1; } else if (htbl->match(htbl->table[position], *data)) { /*********************************************************************** * * * Pass back the data from the table. * * * ***********************************************************************/ *data = htbl->table[position]; return 0; } } /***************************************************************************** * * * Return that the data was not found. * * * *****************************************************************************/ return -1; }