Overview


What is Pentaho Data Integration (Kettle)?

Pentaho Data Integration (PDI), also known as Kettle, is an ETL (extract, transform, load) tool. Though ETL tools are most frequently used in data warehouse environments, PDI can also be used for other purposes:

• Migrating data between applications or databases
• Exporting data from databases to flat files
• Loading large volumes of data into databases
• Data cleansing
• Integrating applications
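
To make these tasks more concrete, here is a minimal sketch in plain Python (not PDI itself) of one of them: exporting rows from a database to a flat file, with a light cleansing step in between. The `customers` table, its columns, and the `export_customers` function are hypothetical, and an in-memory SQLite database stands in for a real source.

```python
import csv
import io
import sqlite3


def export_customers(conn, out):
    """Extract rows, cleanse them, and write them as CSV to `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "name", "email"])
    rows = conn.execute("SELECT id, name, email FROM customers ORDER BY id")
    written = 0
    for cust_id, name, email in rows:
        # Cleansing: trim whitespace, normalize case, skip rows without an email.
        name = (name or "").strip().title()
        email = (email or "").strip().lower()
        if not email:
            continue
        writer.writerow([cust_id, name, email])
        written += 1
    return written


def demo():
    # Hypothetical source data held in an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "  ada lovelace ", "ADA@Example.com"),
            (2, "alan turing", None),
            (3, "grace hopper", " grace@example.com "),
        ],
    )
    out = io.StringIO()
    count = export_customers(conn, out)
    conn.close()
    return count, out.getvalue()
```

Calling `demo()` returns the number of exported rows together with the CSV text. In PDI, a flow like this would typically be assembled from ready-made steps rather than written by hand.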

PDI can be used as a standalone application or as part of the larger Pentaho Suite. It is one of the most popular open source ETL tools available. PDI supports a wide range of input and output formats, including text files, spreadsheets, and both commercial and free database engines. Moreover, its transformation capabilities allow you to manipulate data with very few limitations.

New Features in Version 6.0:

  1. Improved system performance monitoring: minor bug fixes to the PDI-specific portions of Pentaho.
  2. Data profiling enhancements: the Data Profiling perspective includes DataCleaner, which analyzes tables and columns in preparation for ETL.
  3. Deliver data from multiple data sources: the Data Services and Kettle JDBC driver enable you to deliver data from multiple data sources while enriching, cleansing, and transforming the data.
  4. Data movement load balancing: PDI provides load balancing of data within transformations and across multiple cluster nodes when using transformation clustering.
  5. Revert changes in job database transactions: database connections can be used with all jobs, which enables commits and rollbacks at the job level. Prior to this release, this was only possible with transformations.
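
The idea behind committing or rolling back at the job level can be illustrated outside PDI. The following is a minimal sketch in plain Python with SQLite, not PDI's implementation: several steps share one database connection, and their work is committed only if every step succeeds; otherwise all of it is rolled back. The `run_job` helper, the step functions, and the `orders` table are hypothetical.

```python
import sqlite3


def run_job(conn, steps):
    """Run all steps on one connection; commit everything or roll back everything."""
    try:
        for step in steps:
            step(conn)
        conn.commit()
        return True
    except Exception:
        conn.rollback()
        return False


def load_orders(conn):
    conn.execute("INSERT INTO orders (id, amount) VALUES (1, 10.0)")
    conn.execute("INSERT INTO orders (id, amount) VALUES (2, 25.5)")


def failing_step(conn):
    # Violates the PRIMARY KEY constraint and raises sqlite3.IntegrityError.
    conn.execute("INSERT INTO orders (id, amount) VALUES (1, 99.0)")


def count_orders(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.commit()
    # The second step fails, so the inserts from the first step are undone too.
    failed = run_job(conn, [load_orders, failing_step])
    after_failure = count_orders(conn)
    # With only the succeeding step, the inserts are committed.
    succeeded = run_job(conn, [load_orders])
    after_success = count_orders(conn)
    conn.close()
    return failed, after_failure, succeeded, after_success
```

Here the first run leaves the table empty because one failing step undoes the whole job, while the second run commits both rows.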
