This recipe guides you through starting a Data Integration server and the simple steps required to work with a Data Integration repository. We will add a MongoDB MapReduce transformation to the DI repository and define a data service that runs from the server.
To get ready for this recipe, you first need to start the MongoDB server with the same database as in the last chapter. You will also have to verify that <user home folder>/.pentaho/metastore is accessible to the Data Integration server.
To run the DI Server, perform the following steps:
1. Use the ctlscript.sh script on Unix/Linux operating systems, or ctlscript.bat on Windows operating systems, included in the Pentaho EE suite. This script controls the servers packed in the platform; we can start, stop, and restart the various servers with it:
   1. Go to the <pentaho-installation-path>/ folder.
   2. Run the ./ctlscript.sh help command to get all the available options for managing the Pentaho suite.
   3. Run the ./ctlscript.sh start command, and all Pentaho services will start. As mentioned before, it is also possible to start individual servers with this script. We could have started the PostgreSQL server first (needed by the Data Integration server) and the Data Integration server afterwards, using ./ctlscript.sh start postgresql and then ./ctlscript.sh start data-integration-server.
2. Alternatively, start the DI server with the <pentaho-installation-path>/server/data-integration-server/start-pentaho.sh file on Unix/Linux operating systems and <pentaho-installation-path>/server/data-integration-server/start-pentaho.bat on Windows operating systems. On Windows, you can also start the DI server from the Start menu by going to Start | Pentaho Enterprise Edition | Server Management | Start Data Integration Server.
3. Open http://localhost:9080/pentaho-di in your web browser. You should get a login page similar to the one shown in the screenshot.
4. Log in with admin as the username and password as the password.
5. View the available data services by navigating to the http://localhost:9080/pentaho-di/kettle/listServices endpoint.
6. Open the chapter1-mongodb-map-reduce-writelog-without-parameter.ktr file in Spoon, save it as chapter2-mongodb-map-reduce.ktr, and change the transformation name to MongoDB MapReduce Kettle Thin.
7. Create a connection to the DI repository: enter http://localhost:9080/pentaho-di in the URL property, PentahoMongoDB in the Name property, and Pentaho MongoDB Cookbook in the Description property. Then click on the OK button.
8. Define a data service for the transformation and name it MapReduceTable.
9. Configure the PentahoMongoDB repository with the DI server by adding the following XML to the <pentaho-ee-installation-path>/server/data-integration-server/pentaho-solutions/system/kettle/slave-server-config.xml file, inside the slave_config tag:

       <slave_config>
         …
         <repository>
           <name>PentahoMongoDB</name>
           <username>admin</username>
           <password>password</password>
         </repository>
       </slave_config>

10. Copy <pentaho-ee-installation-path>/design-tools/data-integration/plugins/pentaho-mongodb-plugin/lib/mongo-java-driver-2.13.0.jar and paste it in the <pentaho-ee-installation-path>/server/data-integration-server/tomcat/webapps/pentaho-di/WEB-INF/lib folder.
11. Restart the DI server with the ./ctlscript.sh restart data-integration-server command.
12. Check the MapReduceTable service definition by navigating to the http://localhost:9080/pentaho-di/kettle/listServices endpoint.

On Windows operating systems, if you don't see your service, one likely reason is a wrong value for the KETTLE_HOME variable.
KETTLE_HOME is the folder that contains the .kettle folder. Inside .kettle, you can find the configuration files for Pentaho Data Integration, for example, the repositories.xml file. As the DI Server runs as a service under the Administrator user, the KETTLE_HOME variable has the value C:\ by default.
To fix this, use one of the following options:

- Copy the repositories.xml file from your user home folder; for example, copy it from C:\Users\<user home folder>\.kettle\repositories.xml to C:\.kettle\repositories.xml.
- Point KETTLE_HOME at the folder that contains your .kettle folder by executing the following command in <pentaho-installation-path>/server/data-integration-server/tomcat/bin:

      tomcat6.exe //US//pentahoDataIntegrationServer ++JvmOptions -DKETTLE_HOME=C:\Users\<user home folder>
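The same lookup rule can be checked from a Unix/Linux shell before restarting the server. The following is a minimal sketch, not part of the original recipe: the function names kettle_repos_file and has_repository are hypothetical, and the check is a plain text match on the `<name>` element rather than a real XML parse.

```shell
#!/bin/sh
# Print the repositories.xml path that PDI would read: under KETTLE_HOME
# when it is set, otherwise under the current user's home folder.
kettle_repos_file() {
    printf '%s/.kettle/repositories.xml\n' "${KETTLE_HOME:-$HOME}"
}

# Succeed only if that file mentions a repository with the given name.
# The default name is the one used in this recipe; this is a rough
# text match, not an XML-aware check.
has_repository() {
    repo_name="${1:-PentahoMongoDB}"
    grep -q "<name>${repo_name}</name>" "$(kettle_repos_file)"
}
```

For example, running `has_repository && echo found` as the user that owns the DI server process shows whether that user can see the PentahoMongoDB repository definition.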
The user interface of the DI server looks similar to the Carte server. However, Carte is a lightweight web server based on the Jetty server and doesn't provide enterprise features, such as scheduling jobs or transformations. The DI server is a Tomcat-based server with more capabilities for integration systems, for example, LDAP authentication.
In this recipe, we walked you through the steps for managing the DI server using the ctlscript.sh script. It's worth noting that it is also possible to use the start-pentaho and stop-pentaho scripts from the <pentaho-ee-installation-path>/server/data-integration-server/ folder.
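The command-line parts of this recipe (starting the servers, copying the MongoDB driver, restarting the DI server, and checking for the data service) could be wrapped in a few shell functions. This is only a sketch under stated assumptions: the PENTAHO_HOME default of /opt/pentaho, the variable names, and the function names are hypothetical, and the listServices endpoint is assumed to accept HTTP basic authentication with the admin/password credentials used above.

```shell
#!/bin/sh
# Hypothetical defaults; adjust them to your installation.
PENTAHO_HOME="${PENTAHO_HOME:-/opt/pentaho}"
DI_URL="${DI_URL:-http://localhost:9080/pentaho-di}"
DRIVER_JAR="mongo-java-driver-2.13.0.jar"

# Start PostgreSQL first, then the DI server (the order given in the recipe).
start_di_stack() {
    "$PENTAHO_HOME/ctlscript.sh" start postgresql &&
        "$PENTAHO_HOME/ctlscript.sh" start data-integration-server
}

# Copy the MongoDB Java driver from the Spoon plugin folder into the
# DI server web application.
deploy_mongo_driver() {
    cp "$PENTAHO_HOME/design-tools/data-integration/plugins/pentaho-mongodb-plugin/lib/$DRIVER_JAR" \
       "$PENTAHO_HOME/server/data-integration-server/tomcat/webapps/pentaho-di/WEB-INF/lib/"
}

# Restart only the DI server so that it picks up the new jar and
# the edited slave-server-config.xml.
restart_di_server() {
    "$PENTAHO_HOME/ctlscript.sh" restart data-integration-server
}

# Succeed if the named data service appears in the listServices response.
# Basic authentication with admin/password is an assumption.
service_is_listed() {
    curl --silent --fail --user admin:password "$DI_URL/kettle/listServices" |
        grep -q "${1:-MapReduceTable}"
}
```

A typical sequence would be `deploy_mongo_driver && restart_di_server`, followed by `service_is_listed` once the server has finished starting.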