Pentaho MongoDB Delete Plugin

A Pentaho Data Integration plugin to delete MongoDB documents


Delete by incoming row

The MongoDB Delete transformation step enables you to delete documents inside a MongoDB collection. For additional information about MongoDB, see the MongoDB documentation.

Configure connection tab

The Configure connection tab enables you to specify how the step connects to the MongoDB instance: hosts, port, credentials, and timeouts.

Option Definition
Step name Name of the step as it appears in the transformation workspace.
Host name(s) or IP address(es) The network name or address of the MongoDB instance or instances.
Use all replica set members/mongos Differentiates between a replica set containing one node and a stand-alone single Mongo host.
Port Indicates the port number of the MongoDB instance or instances. This value is used as the default port for any host in the Host name(s) or IP address(es) field that does not specify its own port.
Username Indicates the username required to access the database.
Password Indicates the password associated with the provided Username.
Authenticate using Kerberos Indicates whether to use the Kerberos service to manage the authentication process. If you check this, make sure that you enter the Kerberos principal as the Username.
Connection timeout Designates how long to wait for a connection to a database (in milliseconds) before terminating the connection attempt. Leave blank to never terminate the connection.
Socket timeout Designates how long to wait for a write operation (in milliseconds) before terminating the operation. Leave blank to never terminate the operation.
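
The interaction between the Host name(s) or IP address(es) field and the Port field can be illustrated with a minimal Python sketch. It assumes hosts are entered as a comma-separated list with an optional :port suffix; the function name parse_hosts is hypothetical and this is not the plugin's own code. IPv6 addresses are not handled.

```python
def parse_hosts(hosts, default_port=27017):
    """Split a comma-separated host list into (host, port) pairs.

    Assumption: each entry is "host" or "host:port". Entries without a
    port fall back to default_port, mirroring the Port option.
    """
    servers = []
    for entry in hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        host, _, port = entry.partition(":")
        servers.append((host, int(port) if port else default_port))
    return servers


if __name__ == "__main__":
    # The first host keeps its own port; the second uses the default.
    print(parse_hosts("db1.example.com:27018, db2.example.com"))
```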

Delete options tab

The Delete options tab enables you to specify the database and collection you want to delete documents from. You can also indicate the read preferences and tag sets in this tab.

Option Definition
Database Name of the database to delete documents from. Click Get DBs to populate the drop-down menu with a list of databases on the server.
Collection Name of the collection to delete documents from. Click Get collections to populate the drop-down menu with a list of collections within the database.
Read preference Indicates which node to read from first: Primary, Primary preferred, Secondary, Secondary preferred, or Nearest.

Delete query tab

The Delete query tab enables you to define the criteria for the documents you want to delete.
You can define the delete criteria in two ways:

1. Build delete criteria based on available data from previous steps
Option Definition
Use JSON Query Leave this option at its default (not selected) to build the delete criteria from data available from previous steps.
Mongo document path The key in the MongoDB document to use as a criterion, for example _id, name, or age.
Comparator The comparison operator to apply. Possible values: = (equal), <> (not equal), < (less than), <= (less than or equal), > (greater than), >= (greater than or equal), BETWEEN, IS NULL, IS NOT NULL.
Incoming field 1 Name of the incoming field whose value is used in the comparison.
Incoming field 2 Name of a second incoming field. It is needed only for comparators that take two arguments, in this case BETWEEN.
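
To make the options above concrete, the following Python sketch shows one plausible way a single criterion could translate into a MongoDB filter document. The operator names ($ne, $lt, $lte, $gt, $gte) are standard MongoDB query operators, but the mapping itself is an assumption: the plugin's actual translation is not documented here, in particular whether BETWEEN is inclusive and how IS NULL treats missing keys. The function name build_condition is hypothetical.

```python
# Assumed mapping from the step's comparators to MongoDB query operators.
OPERATORS = {"<>": "$ne", "<": "$lt", "<=": "$lte", ">": "$gt", ">=": "$gte"}


def build_condition(path, comparator, row, field1=None, field2=None):
    """Build a MongoDB filter for one criterion from an incoming row.

    path       -- the Mongo document path (key) to compare
    comparator -- one of the comparators listed in the table above
    row        -- dict of incoming field names to values
    field1/2   -- names of the incoming fields that supply the values
    """
    comparator = comparator.strip().upper()
    if comparator == "IS NULL":
        # In MongoDB, {key: null} matches null values and missing keys.
        return {path: None}
    if comparator == "IS NOT NULL":
        return {path: {"$ne": None}}
    value1 = row[field1]
    if comparator == "=":
        return {path: value1}
    if comparator in OPERATORS:
        return {path: {OPERATORS[comparator]: value1}}
    if comparator == "BETWEEN":
        # Assumption: both bounds are inclusive.
        return {path: {"$gte": value1, "$lte": row[field2]}}
    raise ValueError(f"Unsupported comparator: {comparator}")


if __name__ == "__main__":
    row = {"min_age": 18, "max_age": 30}
    print(build_condition("age", "BETWEEN", row, "min_age", "max_age"))
```

A filter built this way has the shape a MongoDB driver's delete call expects; passing it to a driver is left out so the sketch stays runnable without a database.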

2. Use JSON Query expression
Option Definition
Use JSON Query Select this option to use a JSON query expression as the delete criteria. When you select it, the Query field is displayed.
Execute for each row Select this option to substitute incoming field values into the query. Example: in {"_id": "?{id_field}"}, the placeholder ?{id_field} is replaced at runtime with the value of the field id_field.
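
The ?{field} substitution described above can be sketched in Python as follows. This is only an illustration of the idea, assuming placeholders appear inside quoted JSON strings and are replaced with the string form of the field value; the function name substitute_fields is hypothetical and the plugin's real escaping and typing rules may differ.

```python
import json
import re

# Matches placeholders of the form ?{field_name}.
PLACEHOLDER = re.compile(r"\?\{([^}]+)\}")


def substitute_fields(query_template, row):
    """Replace each ?{name} in the template with row[name], then parse it.

    Values are JSON-escaped so that quotes or backslashes in a field
    value cannot break the surrounding JSON string.
    """
    def replace(match):
        value = str(row[match.group(1)])  # KeyError if the field is missing
        return json.dumps(value)[1:-1]    # escaped text without outer quotes

    return json.loads(PLACEHOLDER.sub(replace, query_template))


if __name__ == "__main__":
    template = '{"_id": "?{id_field}"}'
    print(substitute_fields(template, {"id_field": "abc123"}))
```

With Execute for each row selected, a substitution like this would run once per incoming row, producing one delete query per row.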

Resources

Sample Transformation : delete-by-incoming-row.ktr