The PDI MongoDB Lookup step

MongoDB does not join collections the way a typical relational database does. When you need that functionality, you have to implement it in another layer of your system. Pentaho Data Integration does not cover this out of the box; Ivy Information Systems fills the gap with the MongoDB Lookup step, which ships in the same plugin used in the previous recipe.

Getting ready

To get ready for this recipe, start Spoon, your ETL development environment. Make sure that the MongoDB server is running with the data from the previous chapters and that the Ivy PDI MongoDB Steps plugin from the previous recipe is installed.

How to do it…

Perform the following steps to use MongoDB Lookup:

  1. In Spoon, create a new transformation with the name mongodb-lookup.ktr.
  2. Select the Design tab in the left-hand-side view.
  3. From the Input category folder, find the Generate Rows step, and drag and drop it into the working area in the right-hand-side view.
  4. Double-click on the step to open the Generate Rows dialog.
  5. Set Step Name to Get Customer Name.
  6. Next, set the Limit field to 1.
  7. In the Fields table, add a field called name of type String with the value Euro+ Shopping Channel.
  8. From the Big Data category folder, find the MongoDB Lookup step, and drag and drop it into the working area in the right-hand-side view.
  9. Connect the Get Customer Name step to the MongoDB Lookup step.
  10. Double-click on the step to open the MongoDB Lookup configuration dialog.
  11. Set Step Name to Get Customer Order Details.
  12. In the Configure connection tab, click on the Get DBs button and select the SteelWheels option for the Database field. Then, click on the Get collections button and select the Orders option for the Collection field.
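  13. In the Fields tab, click on the Get fields button. By default, you should get a mapping like name = name. The collection-side field is wrong, however; change it to customer.name so that the stream field name is matched against the customer.name field of the documents in the Orders collection.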
  14. Click on the Get lookup fields button to get some of the possible fields available for the documents. Let's keep just the line, country, postalCode, priceEach, customerNumber, totalPrice, and orderLineNumber fields and remove the others, as you can see in this screenshot:
    [Screenshot: the Fields tab of the MongoDB Lookup dialog with the selected lookup fields]
  15. From the Flow category folder, find the Dummy (do nothing) step, and drag and drop it into the working area in the right-hand-side view.
  16. Connect the Get Customer Order Details step to the Dummy (do nothing) step.
  17. Double-click on the step to open the Dummy (do nothing) configuration dialog.
  18. Set Step Name to OUT.
  19. Click on the OK button. The transformation should look similar to the following screenshot, and you can now preview the transformation to see the results:
    [Screenshot: the completed mongodb-lookup.ktr transformation]
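
To make the data flow of these steps concrete, here is a minimal Python sketch of the same lookup done over in-memory documents. It is not the plugin's implementation. The helper names (get_path, lookup_rows), the shape and values of SAMPLE_ORDERS, and the choice to emit one output row per matching document are all assumptions made for illustration; the real step may treat multiple matches differently.

```python
from typing import Any, Dict, Iterable, Iterator, List


def get_path(doc: Dict[str, Any], dotted_path: str) -> Any:
    """Resolve a dotted path such as 'customer.name' in a nested dict.

    Returns None when any part of the path is missing.
    """
    current: Any = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def lookup_rows(
    rows: Iterable[Dict[str, Any]],
    collection: List[Dict[str, Any]],
    stream_field: str,
    collection_field: str,
    lookup_fields: List[str],
) -> Iterator[Dict[str, Any]]:
    """Enrich each incoming row with fields from the matching documents.

    Assumption: one output row is produced per matching document.
    """
    for row in rows:
        key = row[stream_field]
        for doc in collection:
            if get_path(doc, collection_field) == key:
                enriched = dict(row)  # copy so the input row is not mutated
                for field in lookup_fields:
                    enriched[field] = get_path(doc, field)
                yield enriched


# Hypothetical documents; the real Orders collection will look different.
SAMPLE_ORDERS = [
    {"customer": {"name": "Euro+ Shopping Channel"},
     "orderLineNumber": 1, "priceEach": 10.0, "totalPrice": 20.0},
    {"customer": {"name": "Euro+ Shopping Channel"},
     "orderLineNumber": 2, "priceEach": 5.0, "totalPrice": 15.0},
    {"customer": {"name": "Some Other Customer"},
     "orderLineNumber": 1, "priceEach": 7.0, "totalPrice": 7.0},
]

if __name__ == "__main__":
    # The single row produced by the Get Customer Name step.
    incoming = [{"name": "Euro+ Shopping Channel"}]
    fields = ["priceEach", "totalPrice", "orderLineNumber"]
    for out in lookup_rows(incoming, SAMPLE_ORDERS, "name", "customer.name", fields):
        print(out)
```

Here the incoming row plays the role of the Get Customer Name step, the name to customer.name pairing mirrors the mapping set in the Fields tab, and the fields list corresponds to the lookup fields kept in step 14.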

How it works…

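This recipe walked you through a simple example of what the MongoDB Lookup step can do. We created a single row containing a customer name with the Generate Rows step, and then used the MongoDB Lookup step to match that name against the customer.name field of the Orders collection and add the selected order fields to the stream.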
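
In MongoDB terms, a lookup like this amounts to an equality filter on the key field plus a projection of the wanted fields. The Python sketch below builds those two documents for a given row. Whether the plugin issues exactly this query is an assumption, and build_lookup_query is a hypothetical helper introduced here, not part of PDI or the plugin.

```python
from typing import Any, Dict, List, Tuple


def build_lookup_query(
    row: Dict[str, Any],
    key_map: Dict[str, str],
    lookup_fields: List[str],
) -> Tuple[Dict[str, Any], Dict[str, int]]:
    """Build a MongoDB filter and projection for one incoming row.

    key_map maps a collection field (dotted paths allowed) to the name of
    the stream field whose value it must equal.
    """
    # Equality filter, for example {"customer.name": "Euro+ Shopping Channel"}.
    query_filter = {coll_field: row[stream_field]
                    for coll_field, stream_field in key_map.items()}
    # Return only the requested fields and suppress the default _id.
    projection = {"_id": 0}
    for field in lookup_fields:
        projection[field] = 1
    return query_filter, projection


if __name__ == "__main__":
    row = {"name": "Euro+ Shopping Channel"}
    fields = ["line", "country", "postalCode", "priceEach",
              "customerNumber", "totalPrice", "orderLineNumber"]
    query_filter, projection = build_lookup_query(
        row, {"customer.name": "name"}, fields)
    print(query_filter)
    print(projection)
    # With a driver such as pymongo, these could be passed to
    # Collection.find(query_filter, projection) on the Orders collection.
```

Keeping the query construction separate from the database call makes the mapping between stream fields and collection fields easy to inspect without a running server.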

There's more…

The MongoDB Lookup step is useful whenever you need to bring additional data into the stream. A good exercise, once you understand this functionality, is to select customer names from a Hypersonic database and perform lookups against MongoDB to add more data to each row.
