Implementing the transaction batch producer

In this section, we will first discuss how to call a REST API to fetch BTC/USD transactions. Then we will see how to use Spark to deserialize the JSON payload into a well-typed distributed Dataset.

After that, we will introduce the parquet format and see how Spark makes it easy to save our transactions in this format.

With all of these building blocks, we will then implement our program in a purely functional way using the Test-Driven-Development (TDD) technique.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset