The structure Set
is the set
data structure. A good way to implement a set is as a linked
list. A simple way to do this is to typedef
Set
to List
(see
Example 7.1). In addition to
simplicity, using a typedef has the benefit of making the set somewhat
polymorphic, just as was described for stacks and queues (see Chapter 6). Thus, because the set is a
linked list, we can use linked list operations on it when we want it
to act like one. The biggest benefit of this with sets is that we can
use list_next to traverse a set, and
list_rem_next to remove members without having to
identify them by the data they store. Recall that
set_remove only removes members keyed by their
data, which can be a problem when we do not know the members a set
contains.
In general, the set operations presented here are somewhat costly, primarily because many of them search for members of one set in another by traversing each member. However, we can improve the running times of these operations by using a more efficient searching technique, such as hashing (see Chapter 8). Nevertheless, the implementation provided here is a general-purpose approach whose performance is adequate for small to medium-sized sets of data.
/***************************************************************************** * * * -------------------------------- set.h --------------------------------- * * * *****************************************************************************/ #ifndef SET_H #define SET_H #include <stdlib.h> #include "list.h" /***************************************************************************** * * * Implement sets as linked lists. * * * *****************************************************************************/ typedef List Set; /***************************************************************************** * * * --------------------------- Public Interface --------------------------- * * * *****************************************************************************/ void set_init(Set *set, int (*match)(const void *key1, const void *key2), void (*destroy)(void *data)); #define set_destroy list_destroy int set_insert(Set *set, const void *data); int set_remove(Set *set, void **data); int set_union(Set *setu, const Set *set1, const Set *set2); int set_intersection(Set *seti, const Set *set1, const Set *set2); int set_difference(Set *setd, const Set *set1, const Set *set2); int set_is_member(const Set *set, const void *data); int set_is_subset(const Set *set1, const Set *set2); int set_is_equal(const Set *set1, const Set *set2); #define set_size(set) ((set)->size) #endif
The set_init operation
initializes a set so that it can be used in other operations (see
Example 7.2). Since a set is a
linked list, list_init is called to initialize
it. The match
member is set to
match
by hand because this member is not
used by linked lists and is therefore not set by
list_init.
The runtime complexity of set_init is the same as list_init, or O (1).
The set_destroy operation destroys a set (see Example 7.1). Since a set is a linked list and requires being destroyed in the same manner, set_destroy is defined to list_destroy.
The runtime complexity of set_destroy is the same as list_destroy, or O (n), where n is the number of members in the set.
The set_insert operation inserts a member into a set (see Example 7.2). Since a member must not occur more than once in a set, set_is_member is called to make sure that the set does not already contain the new member. As long as the member does not already exist in the set, list_ins_next is called to insert the member.
The runtime complexity of set_insert is O (n) because set_is_member runs in O (n) time, and list_ins_next runs in O (1).
The set_remove operation removes
a member from a set by traversing it using
list_next until
match
determines that the member to be
removed has been found (see Example
7.2). The pointer prev
points just
before the member to be removed since this is required by
list_rem_next. The
list_rem_next operation sets
data
to point to the data from the member
removed.
The runtime complexity of set_remove is O (n), where n is the number of elements in the set. This is because, in the worst case, the entire set must be traversed in order to find the member to be removed. This results in n times O (1), the cost of the statements within the loop, for a running time of O (n) overall. Once the member is found, list_rem_next removes it in O (1) time.
The set_union operation builds a
set, setu
, which is the union of the sets
set1
and set2
(see Example 7.2). First,
setu
is initialized by calling
set_init. Next, the members of
set1
are inserted into
setu
by calling
list_ins_next repeatedly for each member of
set1
. Finally, the members of
set2
are inserted into
setu
in a similar manner except that
set_is_member is called before each insertion
to ensure that no members are duplicated in
setu
.
The runtime complexity of set_union is
O (mn), where
m is the size of
set1
and n is the
size of set2
. In the first loop, each
member of set1
is traversed and inserted
into setu
, which results in a running
time of O (m). In the
second loop, each element of set2
is
traversed, which results in n times the cost of
the statements within this loop. This loop contains the
O (m) operation
set_is_member. Therefore, the overall
complexity of the loop is O
(mn). Since the two loops are executed one
after another, the complexity of set_union is
the more expensive of the two, or O
(mn).
The set_intersection operation
builds a set, seti
, which is the
intersection of the sets set1
and
set2
(see Example 7.2). First,
seti
is initialized by calling
set_init. Next, for each member of
set1
, set_is_member
is called to determine whether the member is in
set2
. If so, the member is inserted into
seti
.
The runtime complexity of
set_intersection is O
(mn), where m is the size
of set1
and n is the
size of set2
. This is because for each
member in set1
, the
O (n) operation
set_is_member is called to determine whether
the member is in set2
.
The set_difference operation
builds a set, setd
, which is the
difference of the sets set1
and
set2
(see Example 7.2). First,
setd
is initialized by calling
set_init. Next, for each member of
set1
, set_is_member
is called to determine whether the member is in
set2
. If not, the member is inserted into
setd
.
The runtime complexity of set_difference
is O (mn), where
m is the size of
set1
and n is the
size of set2
. This is because for each
member in set1
, the
O (n) operation
set_is_member is called to determine whether
the member is in set2
.
The set_is_member operation
determines whether a particular member exists in a set (see Example 7.2). This is accomplished by
traversing the set using list_next until either
a member matching data
is found or all
members are traversed.
The runtime complexity of set_is_member is O (n), where n is the number of members in the set. This is because, in the worst case, the entire set must be traversed to find the member for which we are searching.
The set_is_subset operation
determines whether one set, set1
, is a
subset of another set, set2
(see Example 7.2). Since a set that is a
subset of another must be the same size or smaller, we begin by
comparing sizes. If this test fails, then
set1
is not a subset of
set2
. Otherwise,
set1
is traversed using
list_next until either a member of
set1
that is not in
set2
is found or all members are
traversed. If we find a member of set1
not in set2
, then
set1
is not a subset of
set2
. If we end up traversing all members
of set1
, then
set1
is a subset of
set2
.
The runtime complexity of set_is_subset
is O (mn), where
m is the size of
set1
and n is the
size of set2
. This is because for each
member in set1
, the
O (n) operation
set_is_member is called to determine whether
the member is in set2
.
The set_is_equal operation
determines whether one set, set1
, is
equal to another set, set2
(see Example 7.2). Since two sets that are
equal must be the same size, we begin by comparing sizes. If the two
sets are not the same size, then they are not equal. If the two sets
are the same size, we need only return the result of whether
set1
is a subset of
set2
. This is determined by calling
set_is_subset.
The runtime complexity of set_is_equal is
O (mn), where
m is the size of
set1
and n is the
size of set2
. This is because
set_is_subset runs in O
(mn) time.
This macro evaluates to the size of a set (see Example 7.1). It works by accessing the
size
member of the
Set
structure.
The runtime complexity of set_size is O (1) because accessing a member of a structure is a simple task that runs in a constant amount of time.
/***************************************************************************** * * * -------------------------------- set.c --------------------------------- * * * *****************************************************************************/ #include <stdlib.h> #include <string.h> #include "list.h" #include "set.h" /***************************************************************************** * * * ------------------------------- set_init ------------------------------- * * * *****************************************************************************/ void set_init(Set *set, int (*match)(const void *key1, const void *key2), void (*destroy)(void *data)) { /***************************************************************************** * * * Initialize the set. * * * *****************************************************************************/ list_init(set, destroy); set->match = match; return; } /***************************************************************************** * * * ------------------------------ set_insert ------------------------------ * * * *****************************************************************************/ int set_insert(Set *set, const void *data) { /***************************************************************************** * * * Do not allow the insertion of duplicates. * * * *****************************************************************************/ if (set_is_member(set, data)) return 1; /***************************************************************************** * * * Insert the data. * * * *****************************************************************************/ return list_ins_next(set, list_tail(set), data); } /***************************************************************************** * * * ------------------------------ set_remove ------------------------------ * * * *****************************************************************************/ int set_remove(Set *set, void **data) { ListElmt *member, *prev; /***************************************************************************** * * * Find the member to remove. * * * *****************************************************************************/ prev = NULL; for (member = list_head(set); member != NULL; member = list_next(member)) { if (set->match(*data, list_data(member))) break; prev = member; } /***************************************************************************** * * * Return if the member was not found. * * * *****************************************************************************/ if (member == NULL) return -1; /***************************************************************************** * * * Remove the member. * * * *****************************************************************************/ return list_rem_next(set, prev, data); } /***************************************************************************** * * * ------------------------------- set_union ------------------------------ * * * *****************************************************************************/ int set_union(Set *setu, const Set *set1, const Set *set2) { ListElmt *member; void *data; /***************************************************************************** * * * Initialize the set for the union. * * * *****************************************************************************/ set_init(setu, set1->match, NULL); /***************************************************************************** * * * Insert the members of the first set. * * * *****************************************************************************/ for (member = list_head(set1); member != NULL; member = list_next(member)) { data = list_data(member); if (list_ins_next(setu, list_tail(setu), data) != 0) { set_destroy(setu); return -1; } } /***************************************************************************** * * * Insert the members of the second set. * * * *****************************************************************************/ for (member = list_head(set2); member != NULL; member = list_next(member)) { if (set_is_member(set1, list_data(member))) { /*********************************************************************** * * * Do not allow the insertion of duplicates. * * * ***********************************************************************/ continue; } else { data = list_data(member); if (list_ins_next(setu, list_tail(setu), data) != 0) { set_destroy(setu); return -1; } } } return 0; } /***************************************************************************** * * * --------------------------- set_intersection --------------------------- * * * *****************************************************************************/ int set_intersection(Set *seti, const Set *set1, const Set *set2) { ListElmt *member; void *data; /***************************************************************************** * * * Initialize the set for the intersection. * * * *****************************************************************************/ set_init(seti, set1->match, NULL); /***************************************************************************** * * * Insert the members present in both sets. * * * *****************************************************************************/ for (member = list_head(set1); member != NULL; member = list_next(member)) { if (set_is_member(set2, list_data(member))) { data = list_data(member); if (list_ins_next(seti, list_tail(seti), data) != 0) { set_destroy(seti); return -1; } } } return 0; } /***************************************************************************** * * * ---------------------------- set_difference ---------------------------- * * * *****************************************************************************/ int set_difference(Set *setd, const Set *set1, const Set *set2) { ListElmt *member; void *data; /***************************************************************************** * * * Initialize the set for the difference. * * * *****************************************************************************/ set_init(setd, set1->match, NULL); /***************************************************************************** * * * Insert the members from set1 not in set2. * * * *****************************************************************************/ for (member = list_head(set1); member != NULL; member = list_next(member)) { if (!set_is_member(set2, list_data(member))) { data = list_data(member); if (list_ins_next(setd, list_tail(setd), data) != 0) { set_destroy(setd); return -1; } } } return 0; } /***************************************************************************** * * * ----------------------------- set_is_member ---------------------------- * * * *****************************************************************************/ int set_is_member(const Set *set, const void *data) { ListElmt *member; /***************************************************************************** * * * Determine if the data is a member of the set. * * * *****************************************************************************/ for (member = list_head(set); member != NULL; member = list_next(member)) { if (set->match(data, list_data(member))) return 1; } return 0; } /***************************************************************************** * * * ----------------------------- set_is_subset ---------------------------- * * * *****************************************************************************/ int set_is_subset(const Set *set1, const Set *set2) { ListElmt *member; /***************************************************************************** * * * Do a quick test to rule out some cases. * * * *****************************************************************************/ if (set_size(set1) > set_size(set2)) return 0; /***************************************************************************** * * * Determine if set1 is a subset of set2. * * * *****************************************************************************/ for (member = list_head(set1); member != NULL; member = list_next(member)) { if (!set_is_member(set2, list_data(member))) return 0; } return 1; } /***************************************************************************** * * * ------------------------------ set_is_equal ---------------------------- * * * *****************************************************************************/ int set_is_equal(const Set *set1, const Set *set2) { /***************************************************************************** * * * Do a quick test to rule out some cases. * * * *****************************************************************************/ if (set_size(set1) != set_size(set2)) return 0; /***************************************************************************** * * * Sets of the same size are equal if they are subsets. * * * *****************************************************************************/ return set_is_subset(set1, set2); }