Pattern mining with Spark MLlib

Having motivated and introduced three pattern mining problems, along with the notation needed to discuss them, we next look at how each of these problems can be solved with an algorithm available in Spark MLlib. As is often the case, applying the algorithms is fairly simple, thanks to the convenient run method that Spark MLlib provides for most algorithms. The more challenging part is understanding the algorithms and the intricacies that come with them. To this end, we will explain the three pattern mining algorithms one by one, study how they are implemented, and use them on toy examples. Only then will we apply them to a real-life data set of click events retrieved from http://MSNBC.com.

The documentation for the pattern mining algorithms in Spark can be found at https://spark.apache.org/docs/2.1.0/mllib-frequent-pattern-mining.html. It provides a good entry point with examples for users who want to dive right in.
