Using ML data in TSVB

As we explained at the beginning of this chapter, TSVB is a perfect fit for the metric analytics use case. It renders charts using the Elasticsearch aggregation API, including the pipeline aggregations, which allow us to post-process aggregation metric results with mathematical functions.

The example we'll go through uses the data from the first job. It renders the traffic in the NASA access logs and annotates it with three levels of anomalies (minor, major, and critical), rendered yellow, orange, and red, respectively.

Let's start by creating a TSVB chart from the visualization palette (click Visualize, then the Add button, and then choose Visual Builder from the palette). The first thing you will see is the following empty chart:

Take note of the following:

  • The Data tab allows you to configure the aggregation and function to apply to the data.
  • The Panel Options tab is where the data source is set.
  • The Annotations tab is where we'll use a second, different data source. This will be where our ML data will be defined so that we can annotate the chart.

First, let's configure the data source in the Panel Options, like so:

As you can see, the configuration is quite simple: we just pass our index pattern name, nasa-*, and the Time Field, @timestamp. The rest of the options can be left at their defaults.

We need to leverage the anomaly detection job data to annotate the preceding chart and help users understand where (and when) to pay attention. First, let's add the critical (red) anomalies, as illustrated in the following screenshot:

The annotations allow you to render the preceding vertical lines on top of the main data source. You can do this by choosing the following:

  • A color, red, for critical.
  • An index, such as .ml-anomalies*, which is the index pattern pointing to the results of anomaly detection jobs.
  • The time field (here, this is timestamp).
  • A query string to filter the data. We have set a condition where any anomaly with an anomaly_score above 70 is a critical anomaly (as opposed to the value of 75 that the ML UI uses by default). In addition, since the results of all jobs are stored in the same index, we need to restrict the results to our job. This is why we have added job_id:nasa-traffic, where nasa-traffic is the name of our first ML job.
  • An icon (here, this is a bomb) that characterizes a critical anomaly.
  • Required fields; we just need the anomaly_score to render our annotation.
  • A template to be rendered when hovering over an annotation. Here, this is the anomaly_score itself.
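Putting these settings together, the Query string field for the critical annotation could look like the following sketch (Lucene syntax; the result_type:bucket filter is an assumption about how anomaly detection results are typically stored in .ml-anomalies* and does not appear in the original configuration):

```
job_id:nasa-traffic AND result_type:bucket AND anomaly_score:>70
```

If the annotation's query language is set to KQL rather than Lucene, the range condition would instead be written as anomaly_score > 70.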

Reproduce the same thing for a major anomaly by changing the query, setting the color to orange, and choosing a different icon:

The condition we have set for a major anomaly is an anomaly_score between 30 and 70. Do the same process for minor (blue) anomalies (for scores between 1 and 30), as shown in the following screenshot:
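The three thresholds are easy to mistype when writing the annotation queries, so here is a small, hypothetical Python helper that encodes the same severity bands we have configured (minor: 1 to 30, major: 30 to 70, critical: above 70). It is only an illustration of the threshold logic, not part of the Kibana configuration; the treatment of the boundary values is an assumption, since the text does not specify which band a score of exactly 30 or 70 falls into.

```python
def severity(anomaly_score):
    """Map an ML anomaly_score to the severity band used by our annotations."""
    if anomaly_score > 70:
        return "critical"  # red annotation
    if anomaly_score >= 30:
        return "major"     # orange annotation
    if anomaly_score >= 1:
        return "minor"
    return None            # scores below 1 are not annotated

print(severity(85))   # critical
print(severity(45))   # major
print(severity(10))   # minor
```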

We haven't configured anything in particular in the Data tab, as the default count aggregation is what we are looking for. However, you can edit the label to make the chart easier to read. With the label and annotations in place, here is how the ML data is overlaid on top of the traffic flow:
