Chapter 8. Sharing Models with Prediction Services

Thus far, we have examined how to build a variety of models with data sources ranging from standard 'tabular' data to text and images. However, this only accomplishes part of our goal in business analysis: we can generate predictions from a dataset, but we cannot easily share the results with colleagues or with other software systems within a company. We also cannot easily replicate the results as new data becomes available without manually re-running the sorts of analyses discussed in previous chapters, nor can we easily scale them to larger datasets over time. We will also have difficulty using our models in a public setting, such as a company's website, without revealing the details of the analysis through the model parameters exposed in our code.

To overcome these challenges, this chapter will describe how to build 'prediction services', web applications that encapsulate and automate the core components of data transformation, model fitting, and scoring of new observations that we have discussed in the context of predictive algorithms in prior sections. By packaging our analysis into a web application, we can both easily scale the modeling system and change implementations in the underlying algorithm, all the while making such changes invisible to the consumer (whether a human or another software system), who interacts with our predictive models by making requests to our application through web URLs and a standard REST application programming interface (API). It also allows initialization and updates to the analysis to be automated through calls to the service, making the predictive modeling task consistent and replicable. Finally, by carefully parameterizing many of the steps, we can use the same service to interact with interchangeable data sources and computation frameworks.

In essence, building a prediction service involves linking several of the components we have already discussed, such as data transformation and predictive modeling, with a set of new components that we will discuss in this chapter for the first time. To this end, we will cover the following topics:

  • How to set up a basic web application and server using the CherryPy and Flask frameworks
  • How to automate a generic modeling framework using a RESTful API
  • Scaling our system using a Spark computation framework
  • Storing the results of our predictive model in database systems for reporting applications we will discuss in Chapter 9, Reporting and Testing – Iterating on Analytic Systems

The architecture of a prediction service

Now with a clear goal in mind—to share and scale the results of our predictive modeling using a web application—what are the components required to accomplish this objective?

The first is the client: this could be either a web browser or simply a user entering a curl command in the terminal (see Aside). In either case, the client sends requests using hypertext transfer protocol (HTTP), a standard transport convention to retrieve or transmit information over a network (Berners-Lee, Tim, Roy Fielding, and Henrik Frystyk. Hypertext transfer protocol--HTTP/1.0. No. RFC 1945. 1996). An important feature of the HTTP standard is that the client and server do not have to 'know' anything about how the other is implemented (for example, which programming language is used to write these components): because both follow the HTTP standard, the messages exchanged between them keep a consistent format regardless of implementation.

The next component is the server, which receives HTTP requests from a client and forwards them to the application. You can think of it as the gateway for the requests from the client to our actual predictive modeling application. In Python, web servers and applications each conform to the Web Server Gateway Interface (WSGI), which specifies how the server and application should communicate. Like the HTTP requests between client and server, this standard allows the server and application to be modular as long as both consistently implement the interface. In fact, there could even be intervening middleware between the server and application that further modifies communication between the two: as long as the format of this communication remains constant, the details of each side of the interface are flexible. While we will use the CherryPy library to build a server for our application, other common web servers include Apache Tomcat, which is written in Java, and Nginx, which is written in C.
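To make the WSGI contract concrete, the following is a minimal sketch using only the Python standard library rather than the CherryPy setup used later in the chapter. The names prediction_app, LoggingMiddleware, call_app, and the /ping endpoint are hypothetical and chosen only for illustration. It shows that an application is simply a callable accepting (environ, start_response), and that middleware can wrap it without either side changing.

```python
from wsgiref.util import setup_testing_defaults


def prediction_app(environ, start_response):
    """A WSGI application: any callable taking (environ, start_response)."""
    path = environ.get("PATH_INFO", "/")
    if path == "/ping":  # hypothetical health-check endpoint
        status, body = "200 OK", b"prediction service is up"
    else:
        status, body = "404 Not Found", b"unknown endpoint"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]  # WSGI bodies are iterables of bytes


class LoggingMiddleware:
    """Hypothetical middleware: records each requested path, then delegates."""

    def __init__(self, app):
        self.app = app
        self.seen_paths = []

    def __call__(self, environ, start_response):
        self.seen_paths.append(environ.get("PATH_INFO", "/"))
        return self.app(environ, start_response)


def call_app(app, path):
    """Invoke a WSGI app directly (no real server) and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    wrapped = LoggingMiddleware(prediction_app)
    print(call_app(wrapped, "/ping"))
    print(wrapped.seen_paths)
```

Because the server only ever sees a callable with this signature, the wrapped and unwrapped applications are interchangeable from its point of view, which is the modularity described above.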

After the client request has been received and forwarded by the server, the application performs operations in response to the request and returns a value indicating the success or failure of the task. These requests could, for example, obtain the predicted score for a particular user, update the training dataset, or perform a round of model training.

Tip

Aside: The curl command

As part of testing our prediction service, it is useful to have a way to quickly issue commands to the server and observe the response we receive back. While we could do some of this interactively using the address bar of a web browser, it is not easy to script browser activities in cases where we want to run a number of tests or replicate a particular command. The curl command, found in most Linux command line terminals, is very useful for this purpose: the same requests (in terms of a URL) can be issued to the prediction service using the curl command as would be given in the browser, and this call can be automated using shell scripting. The curl application can be installed from https://curl.haxx.se/.

The web application relies upon server-side code to perform commands in response to requests from the web server. In our example, this server-side code is divided into several components: the first is a generic interface for the modeling logic, which specifies a standard way to construct predictive models, train them with an input dataset, and score incoming data. The second is an implementation of this framework using the logistic regression algorithm from Chapter 5, Putting Data in its Place – Classification Methods and Analysis. This code relies upon executing Spark jobs, which could be carried out either locally (on the same machine as the web application) or remotely (on a separate cluster).
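As a rough illustration of the split between a generic modeling interface and one concrete implementation, consider the sketch below. It is not the chapter's Spark-based code: the class names PredictiveModel and TinyLogisticRegression, the method names, and the pure-Python gradient descent are all assumptions made so that the example runs without a cluster.

```python
import math
from abc import ABC, abstractmethod


class PredictiveModel(ABC):
    """Hypothetical generic interface: every model can be trained and scored."""

    @abstractmethod
    def train(self, rows, labels):
        """Fit the model to feature rows and their 0/1 labels."""

    @abstractmethod
    def score(self, row):
        """Return a prediction for a single feature row."""


class TinyLogisticRegression(PredictiveModel):
    """Stand-in implementation: logistic regression fit by plain SGD."""

    def __init__(self, learning_rate=0.1, iterations=500):
        self.learning_rate = learning_rate
        self.iterations = iterations
        self.weights = None
        self.bias = 0.0

    def _probability(self, row):
        z = self.bias + sum(w * x for w, x in zip(self.weights, row))
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, rows, labels):
        self.weights = [0.0] * len(rows[0])
        self.bias = 0.0
        for _ in range(self.iterations):
            for row, label in zip(rows, labels):
                # Gradient of the log loss for a single observation.
                error = self._probability(row) - label
                self.weights = [w - self.learning_rate * error * x
                                for w, x in zip(self.weights, row)]
                self.bias -= self.learning_rate * error
        return self

    def score(self, row):
        if self.weights is None:
            raise RuntimeError("model has not been trained")
        return self._probability(row)


if __name__ == "__main__":
    model = TinyLogisticRegression().train(
        [[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
    print(model.score([0.0]), model.score([3.0]))
```

The web application would only depend on the train and score methods of the interface, so an implementation that submits Spark jobs, locally or to a remote cluster, could replace the toy class without changes to the calling code.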

The final piece of this chain is the database systems that persist information used by the prediction service. Such a database could be as simple as a file system on the same machine as the web server or as complex as a distributed database. In our example, we will use both Redis (a simple key-value store) and MongoDB (a NoSQL database) to store the data used in modeling, transient information about our application, and the model results themselves.

As we have emphasized, an important feature of these components is that they are largely independent: because the WSGI standard defines how the web server and application communicate, we could change the server or the predictive model implementation, and as long as the commands used in the web application are the same, the code will still work, since these commands are formatted in a consistent way.

Now that we have covered the basic components of a prediction service and how they communicate with one another, let us examine each in greater detail.
