Pentaho Tutorial for Beginners: learn Pentaho in simple, easy steps, from basic to advanced concepts, with examples. Introduction: the purpose of this tutorial is to provide a comprehensive set of examples for transforming an operational (OLTP) database into a dimensional … Master data integration (ETL) with Pentaho Kettle (PDI) through hands-on work, real case studies, tips, and examples, and walk through a full project from start to end.

Author: Mezik Bajora
Published (Last): 16 April 2013

Transformations describe the data flows for ETL, such as reading from a source, transforming data, and loading it into a target location. Jobs coordinate ETL activities, such as defining the flow and dependencies that determine the order in which transformations run, or preparing for execution by checking conditions such as “Is my source file available?”
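The split between the two can be sketched in plain Python. This is only an illustration of the idea, not how Pentaho Data Integration is implemented; the function names, the CSV layout, and the `name` field are all hypothetical.

```python
import csv
from pathlib import Path
from typing import Dict, List, Optional


def run_transformation(source: Path) -> List[Dict[str, str]]:
    """Data flow: read rows from a source, transform them, return them for loading."""
    with source.open(newline="") as handle:
        rows = list(csv.DictReader(handle))
    # Transform: tidy up the (hypothetical) "name" field on every row.
    return [{**row, "name": row["name"].strip().title()} for row in rows]


def run_job(source: Path) -> Optional[List[Dict[str, str]]]:
    """Coordination: check a precondition, then decide whether to run the transformation."""
    if not source.is_file():  # "Is my source file available?"
        return None
    return run_transformation(source)
```

The job function holds no data logic of its own. It only decides whether and when the transformation runs, which is the separation the paragraph above describes.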

This exercise steps you through building your first transformation with Pentaho Data Integration, introducing common concepts along the way. The exercise scenario includes a flat file. Several of the customer records are missing postal codes (zip codes), which must be resolved before loading into the database.

First connect to a repository, then follow the instructions below to retrieve data from a flat file. Click the Fields tab and click Get Fields to retrieve the input fields from the source file. When the Nr of lines to sample window appears, enter 0 in the field, then click OK.



After completing Retrieve Data from a Flat File, you are ready to add the next step to your transformation. The source file contains several records that are missing postal codes. Use the Filter Rows transformation step to separate out those records so that you can resolve them in a later exercise. After you resolve the missing zip code information, the last task is to clean up the field layout on your lookup stream.
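The Filter Rows idea described above, one input stream split into a "true" stream and a "false" stream by a condition, can be sketched in Python. This is a conceptual sketch only: the function names and the `POSTALCODE` and `CUSTOMERNAME` field names are assumptions made for illustration, not taken from the sample file.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def filter_rows(rows: List[Row], condition: Callable[[Row], bool]) -> Tuple[List[Row], List[Row]]:
    """Send each row to the 'true' stream or the 'false' stream."""
    true_stream: List[Row] = []
    false_stream: List[Row] = []
    for row in rows:
        (true_stream if condition(row) else false_stream).append(row)
    return true_stream, false_stream


def has_postal_code(row: Row) -> bool:
    """Condition: the (hypothetical) POSTALCODE field is present and not blank."""
    value = row.get("POSTALCODE")
    return value is not None and value.strip() != ""


customers: List[Row] = [
    {"CUSTOMERNAME": "A", "POSTALCODE": "94105"},
    {"CUSTOMERNAME": "B", "POSTALCODE": ""},
    {"CUSTOMERNAME": "C", "POSTALCODE": None},
]
complete, missing = filter_rows(customers, has_postal_code)
```

Here `complete` holds the rows that can go straight to the database, and `missing` holds the rows that still need a postal code resolved.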

Cleaning up the layout makes the lookup stream match the format and layout of the other stream going to the Write to Database step. Create a Select values step to rename fields on the stream, remove unnecessary fields, and more. Data Integration provides a number of deployment options.
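What the Select values step does to each row, renaming some fields and dropping others, can be approximated with a small Python helper. The helper and the field names in the usage lines are hypothetical and only illustrate the concept.

```python
from typing import Dict, Iterable, Optional


def select_values(
    row: Dict[str, str],
    rename: Optional[Dict[str, str]] = None,
    remove: Iterable[str] = (),
) -> Dict[str, str]:
    """Return a copy of the row with some fields dropped and others renamed."""
    rename = rename or {}
    dropped = set(remove)
    return {rename.get(name, name): value for name, value in row.items() if name not in dropped}


# Hypothetical field names, used only for this illustration.
lookup_row = {"CITY": "Springfield", "ZIP_RESOLVED": "62701", "SCRATCH": "x"}
cleaned = select_values(lookup_row, rename={"ZIP_RESOLVED": "POSTALCODE"}, remove=["SCRATCH"])
```

After this cleanup, both streams carry the same field names, so a single target step can accept rows from either one.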


Running a Transformation explains these and other options available for execution. The Run Options window appears. Keep the default Pentaho local option for this exercise.

It will use the native Pentaho engine and run the transformation on your local machine. This tab also indicates whether an error occurred in a transformation step.


We did not intentionally put any errors in this tutorial, so it should run correctly. If a mistake had occurred, however, the steps that caused the transformation to fail would be highlighted in red.


You will return to the Filter Rows step later and configure the Send true data to step and Send false data to step settings after adding their target steps to your transformation.

Field: Setting
Connection Name: Sample Data
Connection Type: …

If you get an error when testing your connection, ensure that you have provided the correct settings as described in the table and that the sample database is running.

First, you will use a Text file input step to read from the source file, then you will use a Stream lookup step to bring the resolved postal codes into the stream.
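The Stream lookup idea mentioned above, enriching rows in a main stream with a value found in a second stream by matching key fields, can be sketched in Python. The function, its parameters, and the `CITY`, `STATE`, and `POSTALCODE` field names are assumptions made for illustration; they are not the step's actual configuration.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[str]]


def stream_lookup(
    main_rows: List[Row],
    lookup_rows: List[Row],
    key_fields: Sequence[str],
    value_field: str,
    default: Optional[str] = None,
) -> List[Row]:
    """Add value_field to each main row by matching key_fields against the lookup rows."""
    # Index the lookup stream once so each main row is matched in constant time.
    index = {tuple(row[k] for k in key_fields): row[value_field] for row in lookup_rows}
    result = []
    for row in main_rows:
        key = tuple(row[k] for k in key_fields)
        result.append({**row, value_field: index.get(key, default)})
    return result


# Hypothetical data: records missing a postal code, plus a file of resolved codes.
missing = [{"CITY": "Springfield", "STATE": "IL", "POSTALCODE": None}]
resolved = [{"CITY": "Springfield", "STATE": "IL", "POSTALCODE": "62701"}]
fixed = stream_lookup(missing, resolved, key_fields=["CITY", "STATE"], value_field="POSTALCODE")
```

Rows with no match in the lookup stream receive the `default` value, so they stay in the output and can be inspected later.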