Working with the Spark Key/Value API

In this chapter, we'll be working with the Spark key/value API. We will start by looking at the transformations available on key/value pairs. We will then learn how to use the aggregateByKey method instead of the groupBy() method. Later, we'll cover actions on key/value pairs and the partitioners available for key/value data. At the end of this chapter, we'll implement a custom partitioner that partitions our data by range.

In this chapter, we will be covering the following topics:

  • Available transformations on key/value pairs
  • Using aggregateByKey instead of groupBy()
  • Actions on key/value pairs
  • Available partitioners on key/value data
  • Implementing a custom partitioner
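
To give a feel for what partitioning by range means before we get there, here is a minimal sketch in plain Python. It does not use Spark or its partitioner API. The range_partition function, the boundaries list, and the sample pairs are all hypothetical names and values chosen for illustration, and we assume each boundary is the inclusive upper bound of its partition.

```python
from bisect import bisect_left


def range_partition(key, boundaries):
    """Return the partition index for key.

    boundaries is a sorted list of inclusive upper bounds (an assumption of
    this sketch). Keys above the last boundary go to one extra partition.
    """
    return bisect_left(boundaries, key)


# Hypothetical boundaries: partition 0 holds keys <= 10,
# partition 1 holds keys in (10, 20], partition 2 holds keys > 20.
boundaries = [10, 20]

# Hypothetical key/value pairs.
pairs = [(3, "a"), (10, "b"), (15, "c"), (42, "d")]

# Group each pair under the partition index its key maps to.
partitions = {}
for key, value in pairs:
    index = range_partition(key, boundaries)
    partitions.setdefault(index, []).append((key, value))

for index in sorted(partitions):
    print(index, partitions[index])
```

Every key in the same range lands in the same partition, which is the property a range-based partitioner relies on. The custom partitioner built later in the chapter may choose its boundaries and handle edge cases differently.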