Using the new SQL integration

Elasticsearch SQL was a feature that was introduced in version 6.3, and allows users to query Elasticsearch with SQL queries.

The API is available through the X-Pack API to query an index, like the one shown in the following example:

POST /_xpack/sql?format=txt 
{ 
    "query": "SELECT * FROM my_index WHERE release_date < '2000-01-01'" 
} 
If you want more details on this API, check out the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-sql.html.

To switch and use Elasticsearch SQL as the preferred data source, simply click on the Data button when your data table is selected and choose Change your data source. Then, pick Elasticsearch SQL. In the text area, you can paste any Elasticsearch SQL-compliant query. For example:

SELECT job_id, timestamp, anomaly_score FROM ".ml-anomalies-*" WHERE job_id='nginx-traffic' AND result_type='bucket' ORDER BY anomaly_score DESC

This appears as follows:

The result is exactly the same as in the previous example, where we queried using Elasticsearch's Lucene query string syntax:

Notice in the request that we limited our query to only select result_type:bucket. However, as we described in Chapter 6Alerting on ML Analysis, there are many other result types that can be queried. The full list of all result types can be easily retrieved, however, by making the following request in the Dev Tools Console of Kibana:

GET .ml-anomalies-*/_search
{
"size":0,
"aggs":{
"result_type":{
"terms":{
"field":"result_type"
}
}
}
}

This gives you the following list:

  • bucket
  • model_plot
  • bucket_influencer
  • influencer
  • record
  • model_size_stats
  • model_forecast
  • model_forecast_request_stats
You can find the respective definitions in the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-results-resource.html.

As for the next step, let's customize our SQL query a little bit more and add a clause that will limit the results to only those where the anomaly_score>10 to get rid of all anomalies with a very low score. Simply click on the Expression editor button and modify the Canvas expression to be as follows:

filters
| essql
query="SELECT job_id, timestamp, anomaly_score FROM ".ml-anomalies-*" WHERE job_id='nginx-traffic' AND result_type='bucket' AND anomaly_score>10 ORDER BY anomaly_score DESC"
| table
| render

Now, let's change the raw data table into a horizontal bar chart by replacing the table part of the Canvas expression. In its entirety, this would now be as follows:

filters
| essql
query="SELECT job_id, timestamp, anomaly_score FROM ".ml-anomalies-*" WHERE job_id='nginx-traffic' AND result_type='bucket' AND anomaly_score>10 ORDER BY anomaly_score DESC"
| pointseries x="anomaly_score" y="timestamp" color="job_id"
| plot defaultStyle={seriesStyle lines=0 bars="0.75" points=0 horizontalBars=true} legend=false
| render

In detail, here's what changed:

  • We removed, between the query and the render functions, the table function as it's not needed to render the bar chart
  • We added a pointseries function to build the data series that will be consumed by the bar chart
  • We added a plot function to plot the series neither as a line chart (lines=0) nor a points chart (points=0), but as a horizontal (horizontalBars=true) bar chart (bars="0.75")

This is what you should obtain:

This gives us a bar chart representing top anomalies over time. This might be interesting, but we have yet to leverage the biggest advantage of Canvas over other visualizations in Kibana, the ability to create a pixel-perfect infographic.

So, let's be a little bit more creative and build a very customized slide to represent a fake network operation center. To do so, we can leverage online resources, such as the https://www.freepik.com/ website, which provides a huge amount of vectorized pictures that can be then exported to a PNG file with Adobe Illustrator for use in Canvas. In addition to a few eye candy pictures, we can perhaps create another data element that shows a simple single metric value for the total number of anomalies seen for our ML job.

One way to accomplish this is with the Metric element. Another is with the Markdown element. Let's choose the latter and have the Markdown element display the number of rows that our SQL query returns (as in the total number of anomalies), as defined in the following expression:

filters
| essql
query="SELECT timestamp, anomaly_score FROM ".ml-anomalies-shared" WHERE result_type = 'bucket' AND anomaly_score > 10 AND job_id = 'nginx-traffic'"
| markdown
"#
#
# {{rows.length}}"
| render
For more information on Markdown or Handlebars, consult the following links:

Here, we're using the rows variable that was returned by the data source component in Canvas. The end result is a simple number in a transparent box: 

We need to add a bit of customization to our markdown object, such as the background color, the font, the color, and so on. Feel free to play with the different element style settings to make this element more visually appealing. You can even add your own CSS, for example:

filters
| essql
query="SELECT timestamp, anomaly_score FROM ".ml-anomalies-*" WHERE result_type = 'bucket' AND anomaly_score > 10 AND job_id = 'nginx-traffic'"
| markdown "
#
#
# {{rows.length}}"
| render css="h1 {
text-align: center;
color: #ff1744;
}
"
containerStyle={containerStyle backgroundColor="#444444" border="5px none #FFFFFF" borderRadius="7px" padding="px"}

This will make the element look as follows:

This gives us a lot of opportunities to create a look and feel that will suit any report, even if you want to match your corporate branding or color palette. You can even leverage the fact that markdown elements can be made clickable to allow the user to link to other Canvas workpads or other standard dashboards or visualizations.

The same type of customization that's applied to the bar chart gives me the following expression:

filters
| essql
query="SELECT timestamp, anomaly_score FROM ".ml-anomalies-shared" WHERE result_type = 'bucket' AND anomaly_score > 10 AND job_id = 'nginx-traffic'"
| pointseries x="anomaly_score" y="timestamp"
| plot
defaultStyle={seriesStyle lines=0 bars="2" points=0 horizontalBars=true color="#d32f2f"} legend=false xaxis=true yaxis=true
font={font family="'Open Sans', Helvetica, Arial, sans-serif" size=12 align="left" color="#FFFFFF" weight="normal" underline=false italic=false}
| render containerStyle={containerStyle backgroundColor="#444444"}

This gives us the following rendering:

At this point, you can let your creativity do the rest of the work. We will add some static images of furniture and a silly portrait of one of our colleagues to round out our custom, data-driven infographic: 

As you can see, we are probably not going to be hired for our graphic design capabilities! If you want to see a beautiful example of how interactive and visually stunning a Canvas dashboard can be, take a look at this particular example on the Elastic blog at https://www.elastic.co/blog/monitoring-airport-security-operations-with-canvas-and-elasticsearch.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset