Huffman Compression 401
FIGURE 24.2: (a) The characters are sorted by the frequencies. (b) A linked list is created.
List nodes are expressed by rectangles. Tree nodes are expressed by ovals.
frequencies. Each list node points to a tree node. At the beginning of the algorithm, none
of the tree nodes have children.
Fig. 24.3 shows how to merge the first two list nodes. The first two list nodes, L and R,
are taken. A new tree node N is created and its left and right children are L and R. The
frequency of the newly created tree node is the sum of the frequencies of the two children.
The newly created tree node is not a leaf node so its character is irrelevant. A new list node
is created and points to the newly created tree node. The list nodes must remain sorted in
the ascending order by the tree nodes’ frequencies. These three steps have removed the two
trees from the list with the smallest frequencies, combined them into a single tree, and then
placed the new tree back onto the list, keeping the list sorted by the frequencies.
Fig. 24.4 to Fig. 24.6 repeat the same steps: (i) taking the first two tree nodes, (ii)
creating a parent node (the node’s frequency is the sum of the frequencies of the two
children), (iii) creating a list node, and (iv) inserting the list node and keeping the list
sorted. As a result, two nodes are moved from the list, and one is added, giving a net
change of removing one list node. The linked list becomes shorter.
It is important to understand the concept that is described in the figures. The first part
of the program is described below.
1. A tree structure is created. Each tree node stores a character and the character’s
frequency.
// tree . h1
#i f n d e f TREE_H2
#d ef in e TREE_H3
typedef s t ru ct treenode4
{5
st ruc t treenode * left ;6
st ruc t treenode * right ;7