The PDI MongoDB Delete Step

In this recipe, we will cover the functionality of the MongoDB Delete step. This step was developed by Maas Dianto and is open source under the Apache License version 2.0. It is available on GitHub at https://github.com/maasdi/pentaho-mongodb-delete-plugin.

As the name suggests, this step deletes documents from a collection based on conditions defined by the user.

Getting ready

To get ready for this recipe, start Spoon, your ETL development environment, and make sure that the MongoDB server is running with the data from the previous chapters.

How to do it…

Let's install the MongoDB Delete step and use it in a small example by following these steps:

  1. First, let's install the MongoDB Delete step:
    1. On the menu bar of Spoon, select Help and then Marketplace.
    2. A PDI Marketplace popup will show you the list of plugins available for installation. Search for MongoDB in the Detected Plugins field.
    3. Expand the MongoDB Delete Plugin item, as you can see in the following screenshot:
      [Screenshot: the MongoDB Delete Plugin item expanded in the PDI Marketplace]
    4. Click on the Install this plugin button.
    5. Next, click on the OK button in the alert for restarting Spoon.
    6. Restart Spoon.
  2. Now let's delete the orders of the Baane Mini Imports customer whose priceEach value is greater than or equal to 100:
    1. Using the MongoDB shell, check how many documents exist. Upon running the following query, you should get 20 as the result:
      db.Orders.find({"priceEach":{$gte:100},"customer.name":"Baane Mini Imports"}).count()
    2. In Spoon, create a new transformation with the name delete-mongodb-documents.ktr.
    3. Select the Design tab in the left-hand-side view.
    4. From the Input category folder, find the Generate Rows step and drag and drop it into the working area in the right-hand side view.
    5. Double-click on the step to open the Generate Rows configuration dialog.
    6. Set Step Name to Get Values.
    7. Set the Limit field to 1.
    8. In the Fields table, add the customerName field as a String type with the value Baane Mini Imports. In a new row, add the priceEach field as a Number type with the value 100.
    9. In the Big Data category folder, find the MongoDB Delete step and drag and drop it into the working area in the right-hand-side view.
    10. Connect the Get Values step to the MongoDB Delete step.
    11. Double-click on the step to open the MongoDB Delete configuration dialog.
    12. Select the Delete options tab, click on the Get Dbs button, and select SteelWheels from the Database field. Then, click on the Get collections button and select Orders from the Collection field.
    13. Select the Delete Query tab. In the Mongo document path column, add the priceEach and customer.name fields. Set the Comparator column to >= for priceEach and to = for customer.name. In the Incoming field 1 column, set priceEach and customerName respectively, as you can see in this screenshot:
      [Screenshot: the Delete Query tab of the MongoDB Delete configuration dialog]
    14. Finally, run the transformation. Its structure should look like the one shown in the following screenshot:
      [Screenshot: the transformation, with the Get Values step connected to the MongoDB Delete step]
    15. Run the count query from the first step of this example again. You should now get 0 as the result.
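For comparison, the same count, delete, and count sequence can be sketched directly in the MongoDB shell. This is only an illustration and not part of the transformation: deleteAndVerify is a hypothetical helper, and it assumes a shell recent enough to provide countDocuments() and deleteMany() on a collection (mongosh does).

```javascript
// Same conditions as the recipe:
// priceEach >= 100 and customer.name = "Baane Mini Imports".
const ordersFilter = {
  priceEach: { $gte: 100 },
  "customer.name": "Baane Mini Imports",
};

// Hypothetical helper, not part of PDI or of the plugin.
// `collection` is assumed to expose countDocuments() and deleteMany(),
// as db.Orders does in mongosh.
function deleteAndVerify(collection, filter) {
  const before = collection.countDocuments(filter); // matches before deleting
  const result = collection.deleteMany(filter);     // removes every match
  const after = collection.countDocuments(filter);  // should be 0 afterwards
  return { before, deleted: result.deletedCount, after };
}

// In mongosh, connected to the SteelWheels database:
//   deleteAndVerify(db.Orders, ordersFilter)
```

If the transformation has already been run, before will be 0, so this is best tried on a restored copy of the sample data.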

How it works…

In this recipe, we use the MongoDB Delete step to delete, from the Orders collection of the SteelWheels database, all the documents that have Baane Mini Imports as the customer name and a priceEach value greater than or equal to 100. The Generate Rows step is there only to create a single row that carries the two values for the conditions, for testing purposes.
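To make the mapping on the Delete Query tab more concrete, here is a minimal sketch of how a row and a list of (document path, comparator, incoming field) entries could be turned into a MongoDB filter document. It is an illustration only, not the plugin's actual code: buildDeleteFilter, the shape of the mapping objects, and the comparator table are assumptions, and the recipe itself uses only the = and >= comparators.

```javascript
// Comparator symbols mapped to MongoDB query operators.
// Assumption: illustrative table; the plugin may support a different set.
const COMPARATORS = new Map([
  ["=", null], // plain equality needs no operator
  [">", "$gt"],
  [">=", "$gte"],
  ["<", "$lt"],
  ["<=", "$lte"],
]);

// Hypothetical helper: builds one filter document from one incoming row.
function buildDeleteFilter(mappings, row) {
  const filter = {};
  for (const { path, comparator, incomingField } of mappings) {
    if (!COMPARATORS.has(comparator)) {
      throw new Error(`Unsupported comparator: ${comparator}`);
    }
    const value = row[incomingField];
    if (value === undefined) {
      throw new Error(`Missing incoming field: ${incomingField}`);
    }
    const operator = COMPARATORS.get(comparator);
    filter[path] = operator === null ? value : { [operator]: value };
  }
  return filter;
}

// The settings used in the recipe.
const recipeMappings = [
  { path: "priceEach", comparator: ">=", incomingField: "priceEach" },
  { path: "customer.name", comparator: "=", incomingField: "customerName" },
];
const incomingRow = { customerName: "Baane Mini Imports", priceEach: 100 };

console.log(JSON.stringify(buildDeleteFilter(recipeMappings, incomingRow)));
// prints {"priceEach":{"$gte":100},"customer.name":"Baane Mini Imports"}
```

The printed document has the same conditions as the count query used in the steps, which is why that query returns 0 once the transformation has run.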

However, PDI gives you the flexibility to read data from different data sources and then apply the rules that you need. For example, you can read customer names from a Hypersonic (HSQLDB) database and then delete the matching documents from a MongoDB collection. This is a good exercise for you to try.
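As a starting point for that exercise, the following sketch shows the idea of building one delete filter per incoming row. It assumes that each row reaching the step results in its own delete, and it uses hypothetical names throughout: the customerName field, the placeholder customer names, and the filtersForRows helper stand in for whatever your input step actually produces.

```javascript
// Hypothetical rows, standing in for the output of an input step
// that reads customer names from another database.
const customerRows = [
  { customerName: "Example Customer A" },
  { customerName: "Example Customer B" },
  { customerName: "Example Customer A" }, // duplicate, deleted only once
  { customerName: "" },                   // blank name, skipped
];

// Hypothetical helper: one equality filter per distinct, non-empty name.
function filtersForRows(rows) {
  const seen = new Set();
  const filters = [];
  for (const row of rows) {
    const name = row.customerName;
    if (!name || seen.has(name)) continue; // skip blanks and repeats
    seen.add(name);
    filters.push({ "customer.name": name });
  }
  return filters;
}

console.log(JSON.stringify(filtersForRows(customerRows)));
// prints [{"customer.name":"Example Customer A"},{"customer.name":"Example Customer B"}]
```

Skipping blank names is a deliberate guard in this sketch, so that an empty value coming from the source never turns into a delete condition by accident.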
