Using the Pentaho tools to solve ETL challenges
May 07, 2021

Anonymous | TrustRadius Reviewer
Score 10 out of 10

Overall Satisfaction with Pentaho

Before working for Hitachi Vantara, I had used the Pentaho tools mainly for personal projects. I then had the chance to work directly with the teams that supported the Pentaho tools, and I can say with a good deal of objectivity that they are among the best options on the market for ETL processes. Data science requires extracting data from different sources, organizing it, and transforming it according to each need. Machine learning is built on top of these concepts, and with the Pentaho tools you can accomplish most of it. Because I supported the Pentaho tools while working for Hitachi Vantara, my perspective is somewhat unusual: I saw the tools used to solve internal problems, such as integrations with our release tools and some of our agile tools, so the tools themselves were used to improve their newer versions.

Extracting metadata from thousands of files, organizing that information, filtering it to create release files, and determining how to create meta-information files is just one example of the ETL cycle that can be performed with the Pentaho tools.
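
For readers who have not seen an ETL cycle spelled out, here is a rough sketch of the kind of job described above, written in plain Python rather than with the Pentaho tools. It is not how Pentaho Data Integration does it, and every name in it (`extract_metadata`, `transform`, `load`, `build_manifest`, the `.jar`/`.zip` filter, the manifest layout) is a made-up assumption for illustration only.

```python
import json
import os
from datetime import datetime, timezone


def extract_metadata(root):
    """Extract: walk `root` and collect basic metadata for every file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            records.append({
                "path": os.path.relpath(path, root),
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(
                    info.st_mtime, tz=timezone.utc
                ).isoformat(),
            })
    return records


def transform(records, wanted_extensions):
    """Transform: keep only the wanted file types and sort them by path."""
    kept = [r for r in records if r["extension"] in wanted_extensions]
    return sorted(kept, key=lambda r: r["path"])


def load(records, manifest_path):
    """Load: write the filtered records to a JSON manifest file."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump({"file_count": len(records), "files": records}, handle, indent=2)


def build_manifest(root, manifest_path, wanted_extensions=(".jar", ".zip")):
    """Run the three stages in order and return the records that were written."""
    records = transform(extract_metadata(root), set(wanted_extensions))
    load(records, manifest_path)
    return records
```

Calling `build_manifest("some/build/dir", "manifest.json")` would scan the directory and write one manifest. In Pentaho Data Integration the same three stages would typically be separate steps in a transformation built from the UI, which is where the tools earn their keep once the job grows past a single script.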
  • Open source: the Pentaho tools have a free-to-use version with a lot of support.
  • Performance: the Pentaho tools can be set up to process gigabytes of data seamlessly.
  • Support from the open source developer community.
  • Up-to-date documentation.
  • The web versions of the Pentaho tools are limited to the server component.
  • Worker node features are being improved, but more documentation and support are always welcome.
  • Being able to successfully extract data as required.
  • Able to run complex data transformations from a UI view.
Perhaps Snowflake and Salesforce have some components that align with the Pentaho tools. The Pentaho tools have integrations with these technologies to add more value for end users. The only weakness I can honestly find in the Pentaho tools right now is the lack of a powerful web interface for data transformations. There is a web component from which you can access existing data transformations created with the Pentaho Data Integration tool. Still, the web component only allows visualization of the data transformation and remote execution. A complete web interface with remote execution would be excellent, and I expect we may see something like this at some point in the future.

Do you think Pentaho delivers good value for the price?

Yes

Are you happy with Pentaho's feature set?

Yes

Did Pentaho live up to sales and marketing promises?

I wasn't involved with the selection/purchase process

Did implementation of Pentaho go as expected?

Yes

Would you buy Pentaho again?

Yes

The Pentaho tools are designed so you can start playing around on your own. Of course, you will need guidance at some point, but the training teams are good at guiding new users, and the online documentation is usually pretty up-to-date.

Some of the tools, such as the Pentaho Data Integration tool and the Pentaho Server, are pretty self-explanatory. The other tools may not be as quick or obvious to pick up, but again, with some documentation and some customer support, you can find your way around them.

Pentaho's customer support team was one of the best I've ever seen, from both the documentation and the engineering points of view. When a critical issue related to security or performance was found, the multiple developer teams would change priorities to have it fixed as soon as possible in the next release, keeping our customers and users in general happy and satisfied. Sometimes enterprise customers would have issues that required hot-fixing, and in most cases the problems were solved diligently.

Any company with ETL, data science, or data mining problems can address them with the Pentaho tools.

From first-hand experience, I know that different industries, such as the lead generation industry, which works with other sectors such as loans, medical insurance, and mortgages, can benefit significantly from using the Pentaho tools to perform data extraction and refine records.

Other sectors, such as universities, use the Pentaho tools for different kinds of research.

Banks also use the Pentaho tools to perform internal data mining operations, which, as you can imagine, is challenging with so much data.

Perhaps if you are a single developer trying to extract data for a machine learning process, the Pentaho tools could be a little too much, even overcomplicated, when a few spreadsheets, R, or Python could probably get the job done. Still, the tools can handle the job whether the data transformations are complex or straightforward.

Pentaho Feature Ratings

Pixel Perfect reports: 6
Customizable dashboards: 8
Report Formatting Templates: 6
Drill-down analysis: 7
Formatting capabilities: 7
Integration with R or other statistical packages: 8
Report sharing and collaboration: 7
Publish to Web: 7
Publish to PDF: 7
Report Versioning: 7
Report Delivery Scheduling: 7
Delivery to Remote Servers: 7
Pre-built visualization formats (heatmaps, scatter plots etc.): 7
Location Analytics / Geographic Visualization: 7
Predictive Analytics: 7
Multi-User Support (named login): 7
Role-Based Security Model: 7
Multiple Access Permission Levels (Create, Read, Delete): 7
Single Sign-On (SSO): 7
REST API: 6
JavaScript API: 6
iFrames: 6
Java API: 8
Themeable User Interface (UI): 7
Customizable Platform (Open Source): 7