Using one-sided ML functions to your advantage

Many people realize the usefulness of one-sided functions in ML, such as low_count, high_mean, and so on, to allow for the detection of anomalies only on the high side or on the low side. This is useful when you only care about a drop in revenue or a spike in response time.

When you care about deviations in both directions, however, you are often inclined to use just the regular two-sided function (such as count or mean). On some datasets, it actually works better to use both the high and the low version of the function as two separate detectors. Why is this the case, and under what conditions, you might ask?

The condition where this makes sense is when the dynamic range of the possible deviations is asymmetrical. In other words, the magnitude of potential spikes in the data is far, far bigger than the magnitude of the potential drops, possibly because the count or sum of something cannot be less than zero. Let's look at the following screenshot:

Here, the two-sided sum function properly identifies the large spike on the left with a critical anomaly, but the absence of the expected double bumps in the middle is identified with only warning anomalies. This is because, with a two-sided function, the normalization process ranks all anomalies together. The magnitude (and therefore the unlikeliness) of the spike is far bigger than that of the missing data around 18:00, so the anomaly scores are assigned relative to one another.
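The effect of ranking everything together can be illustrated with a toy calculation. This is only a sketch: the normalize function below is a hypothetical stand-in that scales each score against the largest one in its pool. It is not the normalization algorithm that ML actually uses, and the raw values are made up rather than taken from the screenshots.

```python
def normalize(raw_scores):
    """Toy normalization: scale each raw score to 0-100 relative to the
    largest score in the same pool. Hypothetical, for illustration only."""
    top = max(raw_scores.values())
    return {name: round(100 * score / top) for name, score in raw_scores.items()}


# Made-up raw "unlikeliness" values: one huge spike, two modest dips.
spike = {"spike": 400.0}
dips = {"dip_1": 40.0, "dip_2": 20.0}

# One two-sided detector: spikes and dips share a single pool.
pooled = normalize({**spike, **dips})

# Two one-sided detectors: the dips are only compared with each other.
low_side_only = normalize(dips)

print(pooled)
print(low_side_only)
```

In the pooled case the dips are dwarfed by the spike and land near the bottom of the scale. Scored in their own pool, the same dips spread across the full range.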

However, if the dataset were analyzed with two separate detectors in an advanced job, that is, low_sum(num_trx) and high_sum(num_trx), then the results would look very different. Here's the result of the high side:

Here's the result of the low side:

Notice that the anomalies in the middle are now scored much higher (in this case, with a maximum score of 47, shown in yellow).

So now, when the two one-sided detectors are run together in the same job, you've optimized the dynamic range of each detector (since they have their own normalization table)!
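As a rough sketch of what such a job could look like, the following builds the configuration body for an anomaly detection job with both one-sided detectors. The detector functions and the num_trx field come from the example above. The bucket span, the time field name, and the descriptions are assumptions chosen for illustration, so adjust them to your own data.

```python
import json

# Two one-sided detectors on the same field, instead of a single sum(num_trx).
detectors = [
    {
        "function": "low_sum",
        "field_name": "num_trx",
        "detector_description": "Unusually low sum of num_trx",  # illustrative
    },
    {
        "function": "high_sum",
        "field_name": "num_trx",
        "detector_description": "Unusually high sum of num_trx",  # illustrative
    },
]

job_config = {
    "description": "One-sided detectors for asymmetric data",  # illustrative
    "analysis_config": {
        "bucket_span": "15m",  # assumption: choose a span that suits your data
        "detectors": detectors,
    },
    "data_description": {
        "time_field": "@timestamp",  # assumption: name of your timestamp field
    },
}

# Print the JSON body that would be supplied when creating the job.
print(json.dumps(job_config, indent=2))
```

Because both detectors live in the same job, they analyze the same data in a single pass while their results are still scored separately.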
