Defining the TreeNode class

The overall Tree class relies on the TreeNode class to handle the implementation details of adding, removing, and iterating through the various items in the bag. This class is rather large, so we'll present it in four sections.

The first part shows the essential elements of initialization, representation, and how the attributes are made visible:

class TreeNode:

    def __init__(
        self,
        item: Optional[Comparable],
        less: Optional["TreeNode"] = None,
        more: Optional["TreeNode"] = None,
        parent: Optional["TreeNode"] = None,
    ) -> None:
        self.item = item
        self.less = less
        self.more = more
        if parent:
            self.parent = parent

    @property
    def parent(self) -> Optional["TreeNode"]:
        return self.parent_ref()

    @parent.setter
    def parent(self, value: "TreeNode"):
        self.parent_ref = weakref.ref(value)

    def __repr__(self) -> str:
        return f"TreeNode({self.item!r}, {self.less!r}, {self.more!r})"

Each individual node must have a reference to an item. Additional nodes with items that compare as less than the given item, or more than the given item are optional. Similarly, a parent node is also optional.

The property definitions for the parent methods are used to ensure that the parent attribute is actually a weakref attribute, but it appears like a strong reference. For more information on weak references, see Chapter 3, Integrating Seamlessly - Basic Special Methods. We have mutual references between a TreeNode parent object and its children objects; this circularity could make it difficult to remove TreeNode objects. Using a weakref breaks the circularity, allowing reference counting to remove nodes quickly when they are no longer referenced.

Note that the TreeNode type hints are references to the class from inside the class definition. This circularity can be a syntax problem because the class hasn't been fully defined. To make valid self-referential type hints, mypy allows the use of a string. When mypy is run, the string resolves to the proper type object.

Here are the methods for finding and iterating through the nodes:

def find(self, item: Comparable) -> "TreeNode":
    if self.item is None:  # Root
        if self.more:
            return self.more.find(item)
    elif self.item == item:
        return self
    elif self.item > item and self.less:
        return self.less.find(item)
    elif self.item < item and self.more:
        return self.more.find(item)
    raise KeyError

def __iter__(self) -> Iterator[Comparable]:
    if self.less:
        yield from self.less
    if self.item:
        yield self.item
    if self.more:
        yield from self.more

We saw the find() method, which performs a recursive search from a tree through the appropriate subtree looking for the target item. There are a total of six cases:

When this is the root node of the tree, we'll simply skip it.
When this node has the target item, we'll return it.
When the target item is less than this node's item and there is a branch on the less side, we'll descend into that subtree to search.
When the target item is more than this node's item and there is a branch on the more side, we'll descend into that subtree to search.
There are two remaining cases: the target item is less than the current node, but there's no less branch or the target item is more than the current node, but there's no more branch. These two cases both mean the item cannot be found in the tree, leading to a KeyError exception.

The __iter__() method does what's called an inorder traversal of this node and its subtrees. As is typical, this is a generator function that yields the values from iterators over each collection of subtrees. Although we could create a separate iterator class that's tied to our Tree class, there's little benefit when this generator function does everything we need.

The result of the __iter__() has a type hint of Iterator[Comparable]. This reflects the minimal constraint placed on the items contained in the tree.

Here's the next part of this class to add a new node to a tree:

def add(self, item: Comparable):
    if self.item is None:  # Root Special Case
        if self.more:
            self.more.add(item)
        else:
            self.more = TreeNode(item, parent=self)
    elif self.item >= item:
        if self.less:
            self.less.add(item)
        else:
            self.less = TreeNode(item, parent=self)
    elif self.item < item:
        if self.more:
            self.more.add(item)
        else:
            self.more = TreeNode(item, parent=self)

This is the recursive search for the proper place to add a new node. The structure parallels the find() method.

Finally, we have the (more complex) processing to remove a node from the tree. This requires some care to relink the tree around the missing node:

def remove(self, item: Comparable):
    # Recursive search for node
    if self.item is None or item > self.item:
        if self.more:
            self.more.remove(item)
        else:
            raise KeyError
    elif item < self.item:
        if self.less:
            self.less.remove(item)
        else:
            raise KeyError
    else:  # self.item == item
        if self.less and self.more:  # Two children are present
            successor = self.more._least()
            self.item = successor.item
            if successor.item:
                successor.remove(successor.item)
        elif self.less:  # One child on less
            self._replace(self.less)
        elif self.more:  # One child on more
            self._replace(self.more)
        else:  # Zero children
            self._replace(None)

def _least(self) -> "TreeNode":
    if self.less is None:
        return self
    return self.less._least()

def _replace(self, new: Optional["TreeNode"] = None) -> None:
    if self.parent:
        if self == self.parent.less:
            self.parent.less = new
        else:
            self.parent.more = new
    if new is not None:
        new.parent = self.parent

The remove() method has two sections. The first part is the recursive search for the target node.

Once the node is found, there are three cases to consider:

When we delete a node with no children, we simply delete it and update the parent to replace the link with None. The use of a weak reference from the removed node back to the parent, means memory cleanup and reuse is immediate.
When we delete a node with one child, we can push the single child up to replace this node under the parent.
When there are two children, we need to restructure the tree. We locate the successor node (the least item in the more subtree). We can replace the to-be-removed node with the content of this successor. Then, we can remove the duplicative former successor node.

We rely on two private methods. The _least() method performs a recursive search for the least-valued node in a given tree. The _replace() method examines a parent to see whether it should touch the less or more attribute.

Table of Contents for Defining the TreeNode class

Create new playlist

Sign In

Sign Up

Table of Contents for
Defining the TreeNode class