At this stage, we have done almost all of the analysis and development needed to build an anomaly detector, or more generally a data product based on deep learning.
We are left with only the final, but no less important, step: deployment.
Deployment is generally very specific to the use case and the enterprise infrastructure. In this section, we will cover some common approaches used in general data science production systems.
In the Testing section, we summarized all the different entities in a machine learning pipeline. In particular, we saw the definitions of, and differences between, a model, a fitted model, and the learning algorithm. After we have trained, validated, and selected the final model, we have a fitted version of it ready to be used. During the testing phase (except in A/B testing), we scored only historical data that was generally already available on the machines where we trained the model.
In enterprise architectures, it is common to have a Data Science cluster where you build models and a separate production environment where you deploy and use the fitted model.
One common way to export a fitted model is as a Plain Old Java Object (POJO). The main advantage of a POJO is that it can be easily integrated within a Java app and scheduled to run on a specific dataset, or deployed to score in real time.
H2O allows you to extract a fitted model programmatically or from the Flow Web UI, which we have not covered in this book.
If model is your fitted model, you can save it as a POJO jar in the specified path by running:

model.download_pojo(path)
The POJO jar contains a standalone Java class (a subclass of hex.genmodel.GenModel) with no dependencies on the training data or the entire H2O framework; its only dependency is the h2o-genmodel.jar file, which defines the POJO interfaces and the hex.genmodel.easy.EasyPredictModelWrapper helper class used to score with it. It can be read and used from anything that runs in a JVM.
The POJO object will contain the model class name corresponding to the model id used in H2O (model.id), and for anomaly detection the model category will be hex.ModelCategory.AutoEncoder.
Unfortunately, at the time of writing this chapter, there is still an open issue over implementing the Easy API for AutoEncoder: https://0xdata.atlassian.net/browse/PUBDEV-2232.
Roberto Rösler, from the h2ostream mailing list, solved this problem by implementing his own version of the AutoEncoderModelPrediction class as:
public class AutoEncoderModelPrediction extends AbstractPrediction {
  public double[] predictions;
  public double[] feature;
  public double[] reconstrunctionError;
  public double averageReconstructionError;
}
He also modified the method predictAutoEncoder in the EasyPredictModelWrapper as:
public AutoEncoderModelPrediction predictAutoEncoder(RowData data) throws PredictException {
  double[] preds = preamble(ModelCategory.AutoEncoder, data);
  // save predictions
  AutoEncoderModelPrediction p = new AutoEncoderModelPrediction();
  p.predictions = preds;
  // save raw data
  double[] rawData = new double[m.nfeatures()];
  setToNaN(rawData);
  fillRawData(data, rawData);
  p.feature = rawData;
  // calculate the per-feature reconstruction error
  double[] reconstrunctionError = new double[rawData.length];
  for (int i = 0; i < reconstrunctionError.length; i++) {
    reconstrunctionError[i] = Math.pow(rawData[i] - preds[i], 2);
  }
  p.reconstrunctionError = reconstrunctionError;
  // calculate the mean squared error
  double sum = 0;
  for (int i = 0; i < reconstrunctionError.length; i++) {
    sum = sum + reconstrunctionError[i];
  }
  p.averageReconstructionError = sum / reconstrunctionError.length;
  return p;
}
The custom modified API will expose a method for retrieving the reconstruction error on each predicted row.
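To make the logic of the patched Java method concrete, the same reconstruction error computation can be sketched in a few lines of Python (a stand-alone mirror for illustration, not part of the H2O API): the per-feature error is the squared difference between the raw input and the autoencoder's reconstruction, and the average reconstruction error is their mean.

```python
def reconstruction_error(raw, reconstructed):
    """Per-feature squared error and its mean, mirroring predictAutoEncoder."""
    per_feature = [(r - p) ** 2 for r, p in zip(raw, reconstructed)]
    average = sum(per_feature) / len(per_feature)
    return per_feature, average

# A row whose reconstruction deviates only on the second feature:
per_feature, avg = reconstruction_error([1.0, 2.0, 3.0], [1.0, 4.0, 3.0])
print(per_feature)  # [0.0, 4.0, 0.0]
print(avg)          # 1.333...
```

A large average error for a row, relative to the errors seen on normal data, is what flags it as anomalous.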
In order to make the POJO model work, we must provide data in the same format used during training. The data should be loaded into hex.genmodel.easy.RowData objects, which are simply instances of java.util.HashMap<String, Object>. When you create a RowData object, you must ensure the following:

* You use the same column names and types as in the H2OFrame used for training. For categorical columns, you must use String; for numerical columns, you can use either Double or String. Different column types are not supported.
* To handle categorical levels not seen during training, you can set convertUnknownCategoricalLevelsToNa to true in the model wrapper.

Matching the training data format is probably the trickiest requirement. If our machine learning pipeline is made of a bunch of transformers, those must be exactly replicated in the deployment. Thus, the POJO class is not enough: it must be accompanied by all of the remaining steps in the pipeline, in addition to the H2O neural network.
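To illustrate why transformers must be replicated, here is a sketch in Python (the feature names, statistics, and the transformer itself are hypothetical): if the training pipeline standardized each numeric column before fitting the autoencoder, the deployed scorer must apply the exact same means and standard deviations, saved at training time, to every incoming row before it reaches the model.

```python
# Means and standard deviations saved at training time (hypothetical values).
TRAIN_STATS = {"Feature1": (10.0, 2.0), "Feature2": (0.5, 0.1)}

def standardize_row(row):
    """Apply the training-time standardization to a raw row (a dict,
    analogous to the Java RowData) before handing it to the model."""
    return {
        name: (value - TRAIN_STATS[name][0]) / TRAIN_STATS[name][1]
        for name, value in row.items()
    }

print(standardize_row({"Feature1": 12.0, "Feature2": 0.4}))
# Feature1 -> 1.0, Feature2 -> approximately -1.0
```

If the deployed scorer skipped this step, or used statistics recomputed on production data, the reconstruction errors would be meaningless compared to the training-time baseline.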
Here is an example of a Java main method that reads some data and scores it against an exported POJO class:
import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;

public class Main {
  public static String modelClassName = "autoencoder_pojo_test";

  public static void main(String[] args) throws Exception {
    hex.genmodel.GenModel rawModel;
    rawModel = (hex.genmodel.GenModel) Class.forName(modelClassName).newInstance();
    EasyPredictModelWrapper model = new EasyPredictModelWrapper(rawModel);
    RowData row = new RowData();
    row.put("Feature1", "value1");
    row.put("Feature2", "value2");
    row.put("Feature3", "value3");
    AutoEncoderModelPrediction p = model.predictAutoEncoder(row);
    System.out.println("Reconstruction error is: " + p.averageReconstructionError);
  }
}
We have seen an example of how to instantiate the POJO model as a Java class and use it to score a mock data point. This code can be re-adapted to integrate with an existing enterprise JVM-based system. If you are integrating it in Spark, you can simply wrap the logic we have implemented in the example main class within a function and call it from a map method on a Spark data collection. All you need is for the model POJO jar to be loaded into the JVM where you want to make the predictions. Alternatively, if your enterprise stack is JVM-based, there are a few utility entry points, such as hex.genmodel.PredictCsv, which allows you to specify a CSV input file and a path where the output will be stored. Since AutoEncoder is not yet supported in the Easy API, you would have to modify the PredictCsv main class according to the custom patch we saw before. Another possible architecture is to use Python to build the model and a JVM-based application for the production deployment.
Exporting the model as a POJO class is one way to programmatically include it in an existing JVM system, pretty much like the way you import an external library.
There are a bunch of other situations where the integration works better using a self-contained API, such as in a micro-services architecture or with non-JVM-based systems.
H2O offers the capability of wrapping the trained model in a REST API that is called by specifying the row data to score in a JSON object attached to an HTTP request. The backend implementation behind the REST API is capable of performing everything you would do with the Python H2O API, including the pre- and post-processing steps.
The REST API is accessible from any client capable of issuing HTTP requests.
Unlike the POJO class, the REST API offered by H2O depends on a running instance of the H2O cluster. You can access the REST API at http://hostname:54321 followed by the API version (the latest is 3) and the resource path; for example, http://hostname:54321/3/Frames will return the list of all frames.
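A small helper function makes this URL scheme explicit (the endpoint names come from the text above; the function itself is our own sketch, not part of the H2O client):

```python
def h2o_endpoint(hostname, resource, version=3, port=54321):
    """Compose an H2O REST endpoint URL: http://host:port/<version>/<resource>."""
    return "http://%s:%d/%d/%s" % (hostname, port, version, resource.lstrip("/"))

print(h2o_endpoint("hostname", "Frames"))  # http://hostname:54321/3/Frames
print(h2o_endpoint("hostname", "Models"))  # http://hostname:54321/3/Models
```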
REST APIs support five verbs or methods: GET, POST, PUT, PATCH, and DELETE.
GET is used to read a resource with no side effects, POST to create a new resource, PUT to update and entirely replace an existing resource, PATCH to modify a part of an existing resource, and DELETE to delete a resource. The H2O REST API does not support the PATCH method and adds a new method called HEAD. It is like a GET request but returns only the HTTP status, which is useful to check whether a resource exists without loading it.
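The HEAD semantics can be demonstrated with only the Python standard library. The sketch below spins up a throwaway local file server (a stand-in for any REST backend, not an H2O cluster) and decides resource existence from the status code alone, without downloading the body:

```python
import http.client
import http.server
import threading

def resource_exists(host, port, path):
    """Issue a HEAD request and report existence from the HTTP status only."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        return 200 <= conn.getresponse().status < 300
    finally:
        conn.close()

# Throwaway local server standing in for a REST backend.
server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
print(resource_exists("127.0.0.1", port, "/"))         # True
print(resource_exists("127.0.0.1", port, "/no-such"))  # False
server.shutdown()
```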
Endpoints in H2O could be Frames, Models, or Clouds, which are pieces of information related to the status of nodes in the H2O cluster.
Each endpoint specifies its own payload and schema; the documentation can be found at http://docs.h2o.ai/h2o/latest-stable/h2o-docs/rest-api-reference.html.
The H2O Python module provides a connection handler for all the REST requests:

with H2OConnection.open(url='http://hostname:54321') as hc:
    hc.info().pprint()
The hc object has a method called request that can be used to send REST requests:

hc.request(endpoint='GET /3/Frames')
Data payloads for POST requests can be added using either the data argument (x-www-form-urlencoded format) or the json argument (JSON format), specifying a dictionary of key-value pairs. Uploading a file happens by specifying the filename argument mapping to a local file path.
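The difference between the two payload encodings can be shown with the standard library alone (the dictionary below is a hypothetical payload, not a required schema):

```python
import json
from urllib.parse import urlencode

payload = {"destination_frame": "my-data.hex", "parse_type": "CSV"}

# x-www-form-urlencoded, the shape sent for the data argument:
print(urlencode(payload))   # destination_frame=my-data.hex&parse_type=CSV

# JSON, the shape sent for the json argument:
print(json.dumps(payload))  # {"destination_frame": "my-data.hex", "parse_type": "CSV"}
```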
At this stage, whether we use the Python module or any REST client, we must perform the following steps in order to upload some data and get the model scores back:
1. Import the data by sending POST /3/ImportFiles with an ImportFilesV3 schema, including a remote path from which to load the data (via http, s3, or other protocols). The corresponding destination frame name will be the file path:

POST /3/ImportFiles HTTP/1.1
Content-Type: application/json

{ "path" : "http://s3.amazonaws.com/my-data.csv" }
2. Guess the parsing parameters by sending POST /3/ParseSetup with the imported source frame; H2O will infer the column types, separators, and other parsing attributes:

POST /3/ParseSetup HTTP/1.1
Content-Type: application/json

{ "source_frames" : "http://s3.amazonaws.com/my-data.csv" }
3. Parse the data into an H2O frame via POST /3/Parse, using the setup returned by the previous step:

POST /3/Parse HTTP/1.1
Content-Type: application/json

{ "destination_frame" : "my-data.hex",
  "source_frames" : [ "http://s3.amazonaws.com/my-data.csv" ],
  "parse_type" : "CSV",
  "number_of_columns" : "3",
  "columns_types" : [ "Numeric", "Numeric", "Numeric" ],
  "delete_on_done" : "true" }
4. Parsing is asynchronous; poll the returned job until it completes:

GET /3/Jobs/$job_name HTTP/1.1
5. Score the parsed frame against the model by sending a request to the Predictions endpoint:

POST /3/Predictions/models/$model_name/frames/$frame_name HTTP/1.1
Content-Type: application/json

{ "predictions_frame" : "$prediction_name",
  "reconstruction_error" : "true",
  "reconstruction_error_per_feature" : "false",
  "deep_features_hidden_layer" : 2 }
6. Once the results have been retrieved, delete both the input and the prediction frames:

DELETE /3/Frames/$frame_name
DELETE /3/Frames/$prediction_name
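The steps above can be collected into a small Python driver. The sketch below only composes the method, path, and payload for each call (the frame and model names are hypothetical); actually sending the requests, for example with hc.request or any HTTP client, and polling the parse job are left to the surrounding application:

```python
def scoring_plan(data_url, frame, model, prediction_frame):
    """Return the (method, path, payload) sequence for REST-based scoring."""
    return [
        ("POST", "/3/ImportFiles", {"path": data_url}),
        ("POST", "/3/ParseSetup", {"source_frames": data_url}),
        ("POST", "/3/Parse", {"destination_frame": frame,
                              "source_frames": [data_url],
                              "delete_on_done": "true"}),
        # ... poll GET /3/Jobs/$job_name here until the parse completes ...
        ("POST", "/3/Predictions/models/%s/frames/%s" % (model, frame),
         {"predictions_frame": prediction_frame,
          "reconstruction_error": "true"}),
        ("DELETE", "/3/Frames/%s" % frame, None),
        ("DELETE", "/3/Frames/%s" % prediction_frame, None),
    ]

for method, path, payload in scoring_plan(
        "http://s3.amazonaws.com/my-data.csv",
        "my-data.hex", "my_autoencoder", "my-preds"):
    print(method, path)
```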
Let's analyze the input and output of the Predictions API. reconstruction_error, reconstruction_error_per_feature, and deep_features_hidden_layer are parameters specific to AutoEncoder models and determine what will be included in the output. The output is an array of model_metrics, which for AutoEncoder will contain the reconstruction error metrics.
We have seen two options for exporting and deploying a trained model: exporting it as a POJO and incorporating it into a JVM-based application, or using the REST API to call a model that is already loaded into a running H2O instance.
Generally, using a POJO is the better choice because it does not depend on a running H2O cluster. Thus, you can use H2O for building the model and then deploy it on any other system.
The REST API is useful if you want more flexibility and the ability to generate predictions from any client at any time, as long as the H2O cluster is running. The procedure, though, requires multiple steps compared to the POJO deployment.
Another recommended architecture is to use the exported POJO and wrap it within a JVM REST API using frameworks such as Jersey for Java, or Play or akka-http if you prefer Scala. Building your own API means you can programmatically define the way you want to accept input data and what you want to return as output in a single request, as opposed to the multiple steps required in H2O. Moreover, your REST API can be stateless. That is, you don't need to import data into frames and delete them afterwards.
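As a sketch of what such a stateless endpoint could look like, the following uses only the Python standard library, with a trivial stub in place of the call into the POJO scorer (a real JVM implementation would call the wrapped EasyPredictModelWrapper instead): a single POST accepts one row as JSON and returns its reconstruction error in the same request, with no frames to create or clean up.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def score_row(features):
    """Stub standing in for the POJO call. Here the 'reconstruction' is the
    input shifted by 1.0, so the average reconstruction error is always 1.0."""
    reconstructed = [x + 1.0 for x in features]
    errors = [(a - b) ** 2 for a, b in zip(features, reconstructed)]
    return {"averageReconstructionError": sum(errors) / len(errors)}

class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read one row as JSON, score it, and answer in the same request.
        length = int(self.headers["Content-Length"])
        row = json.loads(self.rfile.read(length))
        body = json.dumps(score_row(row["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To serve: HTTPServer(("127.0.0.1", 8000), ScoreHandler).serve_forever()
```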
Ultimately, if you want your POJO-based REST API to be easily ported and deployed everywhere, it is recommended to wrap it in a virtual container using Docker. Docker is an open source framework that allows you to wrap a piece of software in a complete filesystem containing everything it needs to run: code, runtime, system tools, and libraries. In this way, you have a single lightweight container that can always run the same service in every environment.
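A minimal Dockerfile for such a service might look like the following sketch, where the base image, the jar name, and the port are all assumptions for illustration:

```dockerfile
# Java runtime base image for the POJO-based REST API (assumed setup).
FROM openjdk:8-jre
# Application jar bundling the POJO, h2o-genmodel.jar, and the REST layer
# (hypothetical name).
COPY scoring-service.jar /opt/scoring-service.jar
# Port the REST API listens on (hypothetical).
EXPOSE 8080
CMD ["java", "-jar", "/opt/scoring-service.jar"]
```

Building it with docker build and running it with docker run -p 8080:8080 yields the same scoring service in every environment.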
A Dockerized API can easily be shipped and deployed to any of your production servers.