Chapter 1. PDI and MongoDB

In this chapter, we will cover these recipes:

  • Learning basic operations with Pentaho Data Integration
  • Migrating data from the RDBMS to MongoDB
  • Loading data from MongoDB to MySQL
  • Migrating data from files to MongoDB
  • Exporting MongoDB data using the aggregation framework
  • MongoDB Map/Reduce using the User Defined Java Class step and MongoDB Java Driver
  • Working with jobs and filtering MongoDB data using parameters and variables

Introduction

Migrating data from an RDBMS to a NoSQL database, such as MongoDB, isn't an easy task, especially when your RDBMS has a lot of tables. It can be time-consuming, and in most cases a manual approach amounts to developing a bespoke solution for each migration.

Pentaho Data Integration (or PDI, also known as Kettle) is an Extract, Transform, and Load (ETL) tool that can be used as a solution to this problem. PDI provides a graphical drag-and-drop development environment called Spoon. Primarily, PDI is used to build data warehouses. However, it can also be used for other scenarios, such as migrating data between two databases, exporting data to files in different formats (flat, CSV, JSON, XML, and so on), loading data into databases from many different types of data sources, cleaning data, integrating applications, and so on.
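
Although Spoon is the usual way to design and run transformations, PDI also ships as a set of Java libraries, so a transformation built in Spoon can be executed from your own code. The following is a minimal sketch using the Kettle API; the file name sample_transformation.ktr is a placeholder for any transformation you have saved, and the PDI libraries must be on the classpath:

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;

    public class RunTransformation {
        public static void main(String[] args) throws Exception {
            // Initialize the Kettle environment (loads the step plugins,
            // including the MongoDB input/output steps)
            KettleEnvironment.init();

            // Load a transformation definition created in Spoon
            // (sample_transformation.ktr is a hypothetical file name)
            TransMeta transMeta = new TransMeta("sample_transformation.ktr");

            // Execute the transformation and wait for it to finish
            Trans trans = new Trans(transMeta);
            trans.execute(null);
            trans.waitUntilFinished();

            if (trans.getErrors() > 0) {
                System.err.println("Transformation finished with errors.");
            } else {
                System.out.println("Transformation finished successfully.");
            }
        }
    }

The same idea applies to jobs (via the Job and JobMeta classes), and the command-line tools Pan and Kitchen offer an equivalent way to run transformations and jobs outside Spoon.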

The following recipes will focus on the main operations that you need to know to work with PDI and MongoDB.
