Chapter 12. 11¾ A Little Bit of Threading: Dealing with Waiting

image with no caption

Your code can sometimes take a long time to execute. Depending on who notices, this may or may not be an issue. If some code takes 30 seconds to do its thing “behind the scenes,” the wait may not be an issue. However, if your user is waiting for your application to respond, and it takes 30 seconds, everyone notices. What you should do to fix this problem depends on what you’re trying to do (and who’s doing the waiting). In this short chapter, we’ll briefly discuss some options, then look at one solution to the issue at hand: what happens if something takes too long?

Waiting: What to Do?

When you write code that has the potential to make your users wait, you need to think carefully about what it is you are trying to do. Let’s consider some points of view.

image with no caption

Maybe it is the case that waiting for a write is different from waiting for a read, especially as it relates to how your webapp works?

Let’s take another look at the SQL queries in log_request and view_the_log to see how you’re using them.

How Are You Querying Your Database?

In the log_request function, we are using an SQL INSERT to add details of the request to our backend database. When log_request is called, it waits while the INSERT is executed by cursor.execute:

image with no caption

Geek Bits

image with no caption

Code that waits for something external to complete is referred to as “blocking code,” in that the execution of your program is blocked from continuing until the wait is over. As a general rule, blocking code that takes a noticeable length of time is bad.

The same holds for the view_the_log function, which also waits whenever the SQL SELECT query is executed:

image with no caption

Both functions block. However, look closely at what happens after the call to cursor.execute in both functions. In log_request, the cursor.execute call is the last thing that function does, whereas in view_the_log, the results of cursor.execute are used by the rest of the function.

Let’s consider the implications of this difference.

Database INSERTs and SELECTs Are Different

If you’re reading the title to this page and thinking “Of course they are!”, be assured that (this late in this book) we haven’t lost our marbles.

Yes: an SQL INSERT is different from an SQL SELECT, but, as it relates to your use of both queries in your webapp, it turns out that the INSERT in log_request doesn’t need to block, whereas the SELECT in view_the_log does, which makes the queries very different.

This is a key observation.

If the SELECT in view_the_log doesn’t wait for the data to return from the backend database, the code that follows cursor.execute will likely fail (as it’ll have no data to work with). The view_the_log function must block, as it has to wait for data before proceeding.

When your webapp calls log_request, it wants the function to log the details of the current web request to the database. The calling code doesn’t really care when this happens, just that it does. The log_request function returns no value, nor data; the calling code isn’t waiting for a response. All the calling code cares about is that the web request is logged eventually.

Which begs the question: why does log_request force its callers to wait?

image with no caption

Yes. That’s our madcap idea.

When users of your webapp enter a new search, they couldn’t care less that the request details are logged to some backend database, so let’s not make them wait while your webapp does that work.

Instead, let’s arrange for some other process to do the logging eventually and independently of the webapp’s main function (which is to allow your users to perform searches).

Doing More Than One Thing at Once

Here’s the plan: you’re going to arrange for the log_request function to execute independently of your main webapp. To do this, you’re going to adjust your webapp’s code so each call to log_request runs concurrently. This will mean that your webapp no longer has to wait for log_request to complete before servicing another request from another user (i.e., no more delays).

If log_request takes an instant, a few seconds, a minute, or even hours to execute, your webapp doesn’t care (and neither does your user). What you care about is that the code eventually executes.

Concurrent code: you have options

When it comes to arranging for some of your application’s code to run concurrently, Python has a few options. As well as lots of support from third-party modules, the standard library comes with some built-in goodies that can help here.

One of the most well known is the threading library, which provides a high- level interface to the threading implementation provided by the operating system hosting your webapp. To use the library, all you need to do is import the Thread class from the threading module near the top of your program code:

from threading import Thread

Go ahead and add this line of code near the top of your vsearch4web.py file.

For the full list of (and all the details about) Python’s standard library concurrency options, see https://docs.python.org/3/library/concurrency.html.

Now the fun starts.

To create a new thread, you create a Thread object, assigning the name of the function you want the thread to execute to a named argument called target, and providing any arguments as a tuple to another named argument called args. The created Thread object is then assigned to a variable of your choosing.

As an example, let’s assume that you have a function called execute_slowly, which takes three arguments, which we’ll assume are three numbers. The code that invokes execute_slowly has assigned the three values to variables called glacial, plodding, and leaden. Here’s how execute_slowly is invoked normally (i.e., without our worrying about concurrent execution):

execute_slowly(glacial, plodding, leaden)

If execute_slowly takes 30 seconds to do what it has to do, the calling code blocks and waits for 30 seconds before doing anything else. Bummer.

Don’t Get Bummed Out: Use Threads

In the big scheme of things, waiting 30 seconds for the execute_slowly function to complete doesn’t sound like the end of the world. But, if your user is sitting and waiting, they’ll be wondering what’s gone wrong.

If your application can continue to run while execute_slowly goes about its business, you can create a Thread to run execute_slowly concurrently. Here’s the normal function call once more, together with the code that turns the function call into a request for threaded execution:

image with no caption

Granted, this use of Thread looks a little strange, but it’s not really. The key to understanding what’s going on here is to note that the Thread object has been assigned to a variable (t in this example), and that the execute_slowly function has yet to execute.

Assigning the Thread object to t prepares it for execution. To ask Python’s threading technology to run execute_slowly, start the thread like this:

image with no caption

At this point, the code that called t.start continues to run. The 30-second wait that results from running execute_slowly has no effect on the calling code, as execute_slowly’s execution is handled by Python’s threading module, not by you. The threading module conspires with the Python interpreter to run execute_slowly eventually.

First Things First: Don’t Panic

Your first instinct may be to back out the code you added to run log_request in its own thread (and get yourself back to a known good state). But let’s not panic, and let’s not do that. Instead, let’s take a look at that descriptive paragraph that appeared twice in the traceback messages:

        ...
This typically means that you attempted to use functionality that needed
an active HTTP request. Consult the documentation on testing for
information about how to avoid this problem.
        ...

This message is coming from Flask, not from the threading module. We know this because the threading module couldn’t care less about what you use it for, and definitely has no interest in what you’re trying to do with HTTP.

Let’s take another look at the code that schedules the thread for execution, which we know takes 15 seconds to run, as that’s how long log_request takes. While you’re looking at this code, think about what happens during that 15 seconds:

image with no caption

The instant the thread is scheduled for execution, the calling code (the do_search function) continues to execute. The render_template function executes (in the blink of an eye), and then the do_search function ends.

When do_search ends, all of the data associated with the function (its context) is reclaimed by the interpreter. The variables request, phrase, letters, title, and results cease to be. However, the request and results variables are passed as arguments to log_request, which tries to access them 15 seconds later. Unfortunately, at that point in time, the variables no longer exist, as do_search has ended. Bummer.

Don’t Get Bummed Out: Flask Can Help

Based on what you’ve just learned, it appears the log_request function (when executed within a thread) can no longer “see” its argument data. This is due to the fact that the interpreter has long since cleaned up after itself, and reclaimed the memory used by these variables (as do_search has ended). Specifically, the request object is no longer active, and when log_request goes looking for it, it can’t be found.

So, what can be done? Don’t fret: help is at hand.

image with no caption

There’s really no need for a rewrite.

At first glance, it might appear that you’d need to rewrite log_request to somehow rely less on its arguments... assuming that’s even possible. But it turns out that Flask comes with a decorator that can help here.

The decorator, copy_current_request_context, ensures that the HTTP request that is active when a function is called remains active even when the function is subsequently executed in a thread. To use it, you need to add copy_current_request_context to the list of imports at the top of your webapp’s code.

As with any other decorator, you apply it to an existing function using the usual @ syntax. However, there is a caveat: the function being decorated has to be defined within the function that calls it; the decorated function must be nested inside its caller (as an inner function).

Is Your Webapp Robust Now?

Here are the four questions posed at the start of Chapter 11:

  1. What happens if the database connection fails?

  2. Is our webapp protected from web attacks?

  3. What happens if something takes a long time?

  4. What happens if a function call fails?

Your webapp now handles a number of runtime exceptions, thanks to your use of try/except and some custom exceptions that you can raise and catch as required.

When you know something can go wrong at runtime, fortify your code against any exceptions that might occur. This improves the overall robustness of your application, which is a good thing.

Note that there are other areas where robustness could be improved. You spent a lot of time adding try/except code to view_the_log’s code, which took advantage of the UseDatabase context manager. UseDatabase is also used within log_request, and should probably be protected, too (and doing so is left as a homework exercise for you).

Your webapp is more responsive due to your use of threading to handle a task that has to be performed eventually, but not right away. This is a good design strategy, although you do need to be careful not to go overboard with threads: the threading example in this chapter is very straightforward. However, it is very easy to create threading code that nobody can understand, and which will drive you mad when you have to debug it. Use threads with care.

image with no caption

In answering question 3—what happens if something takes a long time?—the use of threads improved the performance of the database write, but not the database read. It is a case of just having to wait for the data to arrive after the read, no matter how long it takes, as the webapp wasn’t able to proceed without the data.

To make the database read go faster (assuming it’s actually slow in the first place), you may have to look at using an alternative (faster) database setup. But that’s a worry for another day that we won’t concern ourselves with further in this book.

However, having said that, in the next and last chapter, we do indeed consider performance, but we’ll be doing so while discussing a topic everyone understands, and which we’ve already discussed in this book: looping.

Chapter 11¾’s Code, 1 of 2

image with no caption

Chapter 11¾’s Code, 2 of 2

image with no caption
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset