Data warehousing

Using a message queuing system is just the first step in our data pipeline design. At the other end of the message queue, we would typically have a data warehouse to process the vast amount of data that arrives. There are numerous options for this, and surveying or comparing them is not the main focus of this book. However, we will skim through two of the most widely used options from the Apache Software Foundation: Apache Hadoop and Apache Spark.
