Don't over-engineer the use case

I once worked with a customer to explore different use cases for ML. This customer was building a hosted security operations center as part of their managed security service provider (MSSP) business, so they were keen to identify use cases in which ML could help.

A high-level theme across their use cases was to look at a user's activity and find unexpected behavior. One example we discussed was login activity from unusual or rare locations, such as "Bob just logged in from Ukraine, but he doesn't normally log in from there."

While thinking the implementation through, they pointed out that they had multiple clients, each of which had multiple users. Therefore, they were considering ways to split or partition the data so that they could run "rare" by country for each and every user of every client.

I asked them to take a step back and said, "Is it worth flagging as an anomaly if anyone logs in from Ukraine, not just Bob?" The answer was "Yes."

So, in this case, there is no point in splitting the analysis out per user. Instead, keep the partitioning at the client level and simply lump all of the users' locations for each client into a single pool of observed countries. This is actually a better scenario: there's more overall data, and the rare function works best when there's lots of routine data to contrast a novel observation against.
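To make the pooling idea concrete, here is a minimal Python sketch. It is not the ML job's rare function; it is a simple frequency check used only to illustrate the design choice. The event schema `(client, user, country)`, the function name `rare_logins`, and the `min_share` threshold are all assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def rare_logins(events, min_share=0.05):
    """Flag logins whose country is rare within the client's pooled history.

    events: iterable of (client, user, country) tuples (hypothetical schema).
    min_share: assumed cutoff; a country seen in less than this fraction of
    a client's logins is treated as rare.
    """
    events = list(events)

    # One pool of observed countries per client, shared by all of its users.
    pools = defaultdict(Counter)
    for client, _user, country in events:
        pools[client][country] += 1

    flagged = []
    for client, user, country in events:
        total = sum(pools[client].values())
        if pools[client][country] / total < min_share:
            flagged.append((client, user, country))
    return flagged


# Made-up sample data: one client, three users, one login from "UA".
events = (
    [("acme", "alice", "US")] * 30
    + [("acme", "bob", "US")] * 25
    + [("acme", "carol", "CA")] * 10
    + [("acme", "bob", "UA")]
)
flagged = rare_logins(events)
```

Because the pool is keyed by client only, the single login from `UA` stands out against all 66 logins for that client rather than against Bob's 26 alone, and nothing has to be tracked per user.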

Likewise, they were interested in detecting excessive failed login attempts. Their original idea was to track the expected/normal number of logins for every user of every client. This, too, is not really necessary. Simply tracking the typical rate of login activity for the population of users within a client is good enough. It again solves the sparse data issue, and ultimately allows for a more scalable ML job, since the job is not expected to maintain baseline models for every single user.
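The same idea can be sketched for failed logins. The snippet below is only a rough illustration that uses a crude mean-plus-standard-deviation threshold, which is not how the ML job models its baseline. The input shape `(client, bucket, failed_count)`, the function name `excessive_failures`, and the multiplier `k` are assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def excessive_failures(counts, k=3.0):
    """Flag time buckets with unusually many failed logins for a client.

    counts: iterable of (client, bucket, failed_count) tuples, where
    failed_count is already summed over all users of that client
    (hypothetical schema).
    k: assumed number of standard deviations above the mean to flag.
    """
    by_client = defaultdict(list)
    for client, bucket, n in counts:
        by_client[client].append((bucket, n))

    flagged = []
    for client, series in by_client.items():
        values = [n for _, n in series]
        mu, sigma = mean(values), pstdev(values)
        for bucket, n in series:
            # One baseline per client, not one per user.
            if sigma > 0 and n > mu + k * sigma:
                flagged.append((client, bucket, n))
    return flagged


# Made-up hourly counts: 23 routine hours followed by one burst.
hourly = [("acme", h, 4 + (h % 3)) for h in range(23)] + [("acme", 23, 60)]
bursts = excessive_failures(hourly)
```

Here there is one baseline per client. Tracking every user separately would multiply the number of baselines by the number of users, and most of those baselines would be built on very sparse data.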

The moral of the story here is as follows: don't over-engineer the use case if it isn't necessary.
