Pentaho is a product distributed under an open-source license that provides a full range of Business Intelligence solutions, including reporting, data analysis, dashboards and data integration.
Thanks to the Pentaho platform, individual business units gain access to a wide range of valuable information, ranging from sales and profitability analyses of individual customers or products, through reporting for HR and financial departments, to aggregate information for senior management.
The training is addressed to programmers, architects and application administrators who want to create or maintain data extraction, transformation and loading (ETL) processes using Pentaho Data Integration (PDI).
After the training, the participant will acquire skills related to:
- installation and configuration of the Pentaho environment,
- designing, implementing, monitoring, launching and tuning ETL processes,
- working with data in PDI,
- reading various types of data and various data formats,
- filtering, grouping and combining data,
- job scheduling,
- running transformations,
- creating clusters.
The course is designed to take the participant from basic to advanced level.
Day one
- Installing and configuring Pentaho Data Integration
- Creating a repository
- Getting to know the Spoon user interface
- Creating transformations
- Reading and writing to a file
- Working with databases (SQL query generator)
- Filtering, grouping and combining data
- Working with XLS files
Day two
- Creating jobs
- Defining parameters and variables
- Data versioning (support for validity periods)
- Database transactionality in transformations
- Using JavaScript
- Mapping transformations
- Data type conversion and column order in the stream
- Log processing
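To give a feel for the "validity periods" topic before the course, here is a minimal sketch in plain Python rather than in PDI itself. It assumes the topic refers to keeping a history of records with valid-from and valid-to dates (often called a type 2 slowly changing dimension). The `Version` class, the `apply_change` function and all field names are hypothetical and are not part of PDI.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Version:
    key: str                   # natural (business) key, e.g. a customer id
    value: str                 # the tracked attribute
    valid_from: date
    valid_to: Optional[date]   # None means "currently valid"


def apply_change(history: List[Version], key: str, value: str, as_of: date) -> List[Version]:
    """Return a new history list with the change applied as of the given date."""
    result: List[Version] = []
    current: Optional[Version] = None
    for row in history:
        if row.key == key and row.valid_to is None:
            current = row          # the open version for this key
        else:
            result.append(row)     # closed versions and other keys stay as they are
    if current is not None and current.value == value:
        # Nothing changed: keep the open version untouched.
        result.append(current)
        return result
    if current is not None:
        # Close the old version at the change date.
        result.append(Version(current.key, current.value, current.valid_from, as_of))
    # Open a new version that is valid from the change date onwards.
    result.append(Version(key, value, as_of, None))
    return result
```

Each change closes the currently valid row and opens a new one, so earlier values remain available for queries about any past date.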
Day three
- Running transformations and jobs from the command line (kitchen.bat, pan.bat)
- Job scheduling
- Running transformations in parallel
- Remote startup (carte.bat)
- Clustering and partitioning
- Versioning and group work
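As a preview of the command-line topic, the sketch below builds an argument list for the two launchers named above: pan.bat runs transformations and kitchen.bat runs jobs (called tasks in this outline). It uses the Windows-style `/file:`, `/level:` and `/param:` options of the .bat scripts. The helper name `build_pdi_command`, the file path and the parameter name are hypothetical, and the script only prints the command instead of running it.

```python
from typing import Dict, List, Optional


def build_pdi_command(
    launcher: str,
    file_path: str,
    params: Optional[Dict[str, str]] = None,
    log_level: str = "Basic",
) -> List[str]:
    """Build an argument list for pan.bat (transformations) or kitchen.bat (jobs)."""
    args = [launcher, f"/file:{file_path}", f"/level:{log_level}"]
    # Named parameters are passed as /param:NAME=value; sorting keeps
    # the resulting command line deterministic.
    for name, value in sorted((params or {}).items()):
        args.append(f"/param:{name}={value}")
    return args


if __name__ == "__main__":
    # Hypothetical transformation file and parameter, for illustration only.
    cmd = build_pdi_command(
        "pan.bat",
        r"C:\etl\load_customers.ktr",
        params={"LOAD_DATE": "2024-01-31"},
    )
    print(" ".join(cmd))
```

To execute the command for real, the list could be passed to `subprocess.run(cmd, check=True)` from the PDI installation directory. A scheduler such as Windows Task Scheduler can call the same command line.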
Poland - Pentaho Data Integration (PDI) - ETL data processing module (advanced level)