How it works...

At the time of writing this book, the current version of GraphFrames is v0.5, which contains two implementations of PageRank:

  • The first, which we use here, relies on the standalone GraphFrame interface and runs PageRank for a fixed number of iterations; it is selected by setting maxIter
  • The second uses the org.apache.spark.graphx.Pregel interface and runs PageRank until convergence; it is selected by setting tol

For more information, please refer to the GraphFrames Scala documentation on PageRank at https://graphframes.github.io/api/scala/index.html#org.graphframes.lib.PageRank.
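
As a minimal sketch of how the two variants are selected from Python, the helper below calls pageRank twice on an existing GraphFrame. The function name run_both_pagerank_variants and the tolerance value 0.01 are illustrative assumptions, not values taken from this recipe; g is assumed to be a GraphFrame that has already been built.

```python
def run_both_pagerank_variants(g):
    """Run PageRank on GraphFrame `g` with each of the two stopping rules.

    `g` is assumed to be an existing graphframes.GraphFrame.
    """
    # Fixed number of iterations: pass maxIter and leave tol unset.
    fixed = g.pageRank(resetProbability=0.15, maxIter=5)

    # Run until convergence: pass tol and leave maxIter unset.
    # The value 0.01 is only an illustrative assumption.
    converged = g.pageRank(resetProbability=0.15, tol=0.01)

    return fixed, converged
```

Only one of maxIter or tol should be passed in a given call. Each call returns a new GraphFrame whose vertices carry a pagerank column.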

As noted previously, we are using the standalone GraphFrame version of PageRank, with the following settings:

  • resetProbability: This is left at the default value of 0.15, which is the probability of resetting to a random vertex at each step (equivalently, one minus the damping factor). Higher values pull the ranks toward a uniform distribution and speed up convergence; lower values give the link structure more weight but need more iterations to converge.

  • maxIter: For this demo, we have set the value to 5. More iterations bring the ranks closer to their converged values, at the cost of a longer run time.
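
To make the roles of these two settings concrete, here is a small plain-Python sketch of the textbook PageRank update. It is not the GraphFrames implementation, whose scaling of ranks may differ. The function name pagerank, the handling of vertices without outgoing links, and the toy edge list are all assumptions made for illustration.

```python
def pagerank(edges, reset_probability=0.15, max_iter=5):
    """Textbook PageRank by power iteration over (src, dst) edge pairs."""
    vertices = sorted({v for edge in edges for v in edge})
    n = len(vertices)
    out_links = {v: [] for v in vertices}
    for src, dst in edges:
        out_links[src].append(dst)

    ranks = {v: 1.0 / n for v in vertices}  # start from a uniform distribution
    for _ in range(max_iter):
        # Every vertex receives the "reset" share regardless of its links.
        new_ranks = {v: reset_probability / n for v in vertices}
        for src, targets in out_links.items():
            if targets:
                # Pass the remaining rank along the outgoing edges.
                share = (1.0 - reset_probability) * ranks[src] / len(targets)
                for dst in targets:
                    new_ranks[dst] += share
            else:
                # Assumption: a vertex with no outgoing links spreads its
                # rank evenly over all vertices.
                share = (1.0 - reset_probability) * ranks[src] / n
                for v in vertices:
                    new_ranks[v] += share
        ranks = new_ranks
    return ranks


# Hypothetical toy graph: a, b, and d link to c; c links back to a.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]
few = pagerank(edges, reset_probability=0.15, max_iter=5)
many = pagerank(edges, reset_probability=0.15, max_iter=50)
flat = pagerank(edges, reset_probability=1.0, max_iter=5)
```

Comparing few with many shows how far five iterations are from the converged ranks on this toy graph, while flat shows the extreme case in which a reset probability of 1.0 ignores the links entirely and gives every vertex the same rank.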