GraphX in context

Having seen many applications of graph analytics throughout the chapter, a natural follow-up question is how GraphX fits into the rest of the Spark ecosystem and how we can use it for machine learning applications in conjunction with systems like MLlib, which we have seen earlier.

The short answer is that although the graph abstraction itself lives only in Spark GraphX, every graph is backed by vertex and edge RDDs, so we can interoperate seamlessly with any other Spark module. In fact, we have used many core RDD operations throughout the chapter, and it does not stop there. MLlib itself uses GraphX functionality in a few selected places, such as Latent Dirichlet Allocation and Power Iteration Clustering, which are unfortunately beyond the scope of this chapter. Instead, we have focused on explaining the basics of GraphX from first principles. The reader is nevertheless encouraged to apply what we have learned in this chapter, together with the material from the earlier chapters, and to experiment with these algorithms. For the sake of completeness, there is one machine learning algorithm implemented entirely in GraphX, namely SVD++, a graph-based recommender algorithm, which you can read more about at http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf.
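To make the point about vertex and edge RDDs concrete, here is a minimal sketch of handing graph-derived data to MLlib. The toy edge list, the application name, and the choice of degree counts as features are assumptions made purely for illustration; they are not taken from the examples earlier in the chapter. The sketch relies only on the fact that the degree views of a graph are ordinary pair RDDs keyed by vertex ID.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.stat.Statistics

object GraphToMLlibSketch {
  def main(args: Array[String]): Unit = {
    // Local Spark context, for illustration only.
    val conf = new SparkConf().setAppName("graph-to-mllib-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical toy graph: four vertices, four directed edges.
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1.0),
      Edge(2L, 3L, 1.0),
      Edge(3L, 1L, 1.0),
      Edge(3L, 4L, 1.0)
    ))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // inDegrees and outDegrees are VertexRDDs, that is, RDDs of (VertexId, Int),
    // so regular pair-RDD operations such as fullOuterJoin apply directly.
    // Vertices with degree zero are absent from these RDDs, hence the outer join.
    val features = graph.inDegrees
      .fullOuterJoin(graph.outDegrees)
      .map { case (_, (in, out)) =>
        Vectors.dense(in.getOrElse(0).toDouble, out.getOrElse(0).toDouble)
      }

    // The resulting RDD[Vector] can be passed to MLlib like any other data set.
    val summary = Statistics.colStats(features)
    println(s"Mean in/out degree: ${summary.mean}")

    sc.stop()
  }
}
```

The same pattern should carry over to other MLlib entry points that accept an `RDD[Vector]`: once vertex or edge attributes have been mapped to feature vectors, nothing graph-specific remains in the pipeline.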
