Implementing container relationships

When we looked at Blog objects in Chapter 11, Storing and Retrieving Objects via Shelve, we defined a by_tag() method to emit useful dictionary representations of the posts organized by the relevant tag strings. The method's return type hint was Dict[str, List[Dict[str, Any]]]. Because it provided the information needed to create tags with links, the by_tag() method was a helpful part of rendering a Blog instance as RST or HTML. Additionally, the entries property returns the complete collection of Post instances attached to the Blog.

Ideally, the application model class definitions such as Blog and Post are utterly divorced from the access layer objects, which persist them in the external storage. These two use cases suggest the model layer objects must have references to the access layer. There are several strategies, including the following:

A global Access object is used by client classes to perform these query operations. This breaks the encapsulation idea: a Blog is no longer a container for Post entries. Instead, an Access object is a container for both. The class definitions are simplified. All other processing is made more complex.

Include a reference to an access layer object within each Blog object, allowing a client class to work with a Blog object unaware of an access layer. This makes the model layer class definitions a bit more complex. It makes the client work somewhat simpler. The advantage of this technique is it makes the model layer objects behave more like complex Python objects. A complex object may be forced to fetch child objects from the database, but if this can be done transparently, it makes the overall application somewhat simpler.

As a concrete example, we'll add a Blog.by_tag() feature. The idea is to return a dictionary that maps each tag string to the dictionary representations of the posts carrying that tag. This requires considerable work by an access layer object to locate and fetch the dictionary representations of Post instances.

There's no trivial mapping from the relational columns to objects. Therefore, an Access class must build each class of object. As an example, the method for fetching a Blog instance is shown as follows:

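from weakref import ref

def get_blog(self, id: int) -> Blog:
    query_blog = """
        SELECT * FROM blog WHERE id=?
    """
    # Looking up columns by name requires a mapping-style row,
    # for example a connection whose row_factory is sqlite3.Row.
    row = self.database.execute(query_blog, (id,)).fetchone()
    blog = Blog(id=row["ID"], title=row["TITLE"])
    # Weak back-reference: the Blog does not keep the Access object alive.
    blog._access = ref(self)
    return blog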

The relational query retrieves the various attributes for recreating a Blog instance. In addition to creating the core fields, each Blog object has an optional _access attribute. This is not provided at initialization, nor is it part of the representation or comparison of Blog objects. The value is a weak reference to an instance of an Access class. This object will embody the rather complex SQL query required to do the work. This association is inserted by the access object each time a blog instance is retrieved.
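One way to declare such an attribute, assuming Blog is a dataclass, is a field that is excluded from __init__(), __repr__(), and comparison. The following is a minimal sketch rather than the definition used in this chapter; the Access class here is a hypothetical stand-in with no database behind it.

```python
from dataclasses import dataclass, field
from typing import Optional
from weakref import ReferenceType, ref


class Access:
    """Hypothetical stand-in for the access layer; no database here."""


@dataclass
class Blog:
    id: int
    title: str
    # Not an __init__ parameter, hidden from repr(), ignored by ==.
    _access: Optional["ReferenceType[Access]"] = field(
        default=None, init=False, repr=False, compare=False
    )


access = Access()
blog = Blog(id=1, title="Travel")
blog._access = ref(access)  # what get_blog() does after building the Blog

# Calling the weak reference yields the Access object while it is alive.
assert blog._access() is access
# The back-reference does not affect representation or equality.
assert "_access" not in repr(blog)
assert blog == Blog(id=1, title="Travel")
```

Once the Access object has been garbage collected, calling the weak reference returns None, so code that uses _access should be prepared for that case.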

Associating blogs, posts, and tags will require a rather complex SQL query. Here's the SELECT statement required to traverse the associations and locate tag phrases and associated post identifiers:

query_by_tag = """ 
    SELECT tag.phrase, post.id
    FROM tag 
    JOIN assoc_post_tag ON tag.id = assoc_post_tag.tag_id 
    JOIN post ON post.id = assoc_post_tag.post_id 
    JOIN blog ON post.blog_id = blog.id 
    WHERE blog.title=? 
"""

This query's result set is a table-like sequence of rows with two attributes: tag.phrase and post.id. The SELECT statement defines three join operations among the blog, post, tag, and assoc_post_tag tables. The rules are provided via the ON clauses within each JOIN specification. The tag table is joined with the assoc_post_tag table by matching values of the tag.id column with values of the assoc_post_tag.tag_id column. Similarly, the post table is associated by matching values of the post.id column with values of the assoc_post_tag.post_id column. Additionally, the blog table is associated by matching values of the post.blog_id column with values of the blog.id column.
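To see the joins at work, the query can be run against a small in-memory SQLite database. This is only a sketch: the schema below is an assumption that includes just the columns the query touches, and the sample rows are made up for illustration.

```python
import sqlite3

# Assumed minimal schema: only the columns the query touches.
SCHEMA = """
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT,
                       blog_id INTEGER REFERENCES blog(id));
    CREATE TABLE tag (id INTEGER PRIMARY KEY, phrase TEXT UNIQUE);
    CREATE TABLE assoc_post_tag (post_id INTEGER REFERENCES post(id),
                                 tag_id INTEGER REFERENCES tag(id));
"""

QUERY_BY_TAG = """
    SELECT tag.phrase, post.id
    FROM tag
    JOIN assoc_post_tag ON tag.id = assoc_post_tag.tag_id
    JOIN post ON post.id = assoc_post_tag.post_id
    JOIN blog ON post.blog_id = blog.id
    WHERE blog.title = ?
"""

database = sqlite3.connect(":memory:")
database.executescript(SCHEMA)
database.executemany(
    "INSERT INTO blog VALUES (?, ?)", [(1, "Travel"), (2, "Other")])
database.executemany(
    "INSERT INTO post VALUES (?, ?, ?)",
    [(10, "Hello", 1), (11, "Sailing", 1), (20, "Elsewhere", 2)])
database.executemany(
    "INSERT INTO tag VALUES (?, ?)", [(1, "#intro"), (2, "#boats")])
database.executemany(
    "INSERT INTO assoc_post_tag VALUES (?, ?)",
    [(10, 1), (11, 1), (11, 2), (20, 1)])

# One row per (tag, post) pair; post 20 belongs to the other blog
# and is filtered out by the WHERE clause.
rows = sorted(database.execute(QUERY_BY_TAG, ("Travel",)).fetchall())
assert rows == [("#boats", 11), ("#intro", 10), ("#intro", 11)]
```

A post with two tags appears in two rows, which is why the result needs to be regrouped by tag phrase on the Python side.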

The post_by_tag() method of the Access class, which uses this query, is as follows:

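from collections import defaultdict
from dataclasses import asdict
from typing import Any, DefaultDict, Dict, List

def post_by_tag(self, blog: Blog) -> Dict[str, List[Dict[str, Any]]]:
    results = self.database.execute(
        self.query_by_tag, (blog.title,))
    tags: DefaultDict[str, List[Dict[str, Any]]] = defaultdict(list)
    # Group the flat (phrase, post_id) rows into one list of posts per tag.
    for phrase, post_id in results.fetchall():
        tags[phrase].append(asdict(self.get_post(post_id)))
    return tags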
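The grouping step can be tried in isolation, without a database. In this sketch the rows list stands in for the query result, and the posts dictionary stands in for Access.get_post(); both, along with the two-field Post dataclass, are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass
from typing import Any, DefaultDict, Dict, List


@dataclass
class Post:
    id: int
    title: str


# Stand-ins for the query result and for Access.get_post().
rows = [("#intro", 10), ("#intro", 11), ("#boats", 11)]
posts = {10: Post(10, "Hello"), 11: Post(11, "Sailing")}

tags: DefaultDict[str, List[Dict[str, Any]]] = defaultdict(list)
for phrase, post_id in rows:
    # A missing key starts as an empty list, so no membership test is needed.
    tags[phrase].append(asdict(posts[post_id]))

by_tag = dict(tags)  # a plain dict no longer creates keys on lookup
```

Converting to a plain dict before returning is optional. It may be worth considering because a defaultdict handed to a client quietly creates an empty entry whenever an unknown tag is looked up.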

It's important to note this complex SQL query is dissociated from the table definitions. SQL is not an object-oriented programming language. There's no tidy class to bundle data and processing together. Using procedural programming with SQL like this tends to break the object model.

The by_tag() method of the Blog class uses the post_by_tag() method of the Access class to do the actual fetches of the various posts. A client application can then do the following:

blog = some_access_object.get_blog(id=1)
tag_link = blog.by_tag()

The returned Blog instance behaves as if it were an instance of a simple Python class with a simple collection of Post instances. To make this happen, we require some careful injection of access layer references into the returned objects.
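The delegation from Blog.by_tag() to the access layer might look like the following sketch. It assumes Blog is a dataclass with the _access attribute described earlier. FakeAccess is a hypothetical stand-in that returns canned data instead of running SQL, and the RuntimeError for a detached Blog is one possible policy, not something this chapter prescribes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional
from weakref import ref


@dataclass
class Blog:
    id: int
    title: str
    # A weak reference is a callable that returns the target or None.
    _access: Optional[Callable[[], Any]] = field(
        default=None, init=False, repr=False, compare=False
    )

    def by_tag(self) -> Dict[str, List[Dict[str, Any]]]:
        # Dereference the weak reference; None means no live access layer.
        access = self._access() if self._access is not None else None
        if access is None:
            raise RuntimeError("Blog is not attached to a live Access object")
        return access.post_by_tag(self)


class FakeAccess:
    """Hypothetical stand-in returning canned data instead of running SQL."""

    def get_blog(self, id: int) -> Blog:
        blog = Blog(id=id, title="Travel")
        blog._access = ref(self)  # inject the back-reference
        return blog

    def post_by_tag(self, blog: Blog) -> Dict[str, List[Dict[str, Any]]]:
        return {"#intro": [{"id": 10, "title": "Hello"}]}


some_access_object = FakeAccess()
blog = some_access_object.get_blog(id=1)
tag_link = blog.by_tag()
```

The client code in the last three lines never mentions SQL or the access layer's query methods; it only needs to keep some_access_object alive for as long as it uses the Blog.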

In the next section, we'll see how to improve the performance with indices.
