Overall Satisfaction with Pentaho
- Pentaho Kettle gives you a great graphical user interface to plan your transformations and jobs.
- Pentaho Kettle makes it easy to handle errors, logging and performance.
- Pentaho Kettle has dozens of great steps, such as lookup and slowly changing dimension (SCD) functionality.
- Several steps, such as JSON Input, have performance issues.
- The community edition does not include a scheduler or job manager, so you need to set that up yourself, unless of course you buy the Enterprise edition.
- I think that web services should be easier to work with.
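On the missing scheduler: one common workaround is to call Kitchen, Kettle's command-line job runner, from an external scheduler such as cron. Below is a minimal sketch of that idea, not a definitive setup. The install path, job file, and parameter name used in the usage note are hypothetical, and the `-file`, `-level`, and `-param` options are written in the Linux `kitchen.sh` style.

```python
import subprocess
from typing import Dict, List, Optional


def build_kitchen_command(
    kitchen_path: str,
    job_file: str,
    log_level: str = "Basic",
    params: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build the argument list for running a .kjb job with Kitchen."""
    cmd = [kitchen_path, f"-file={job_file}", f"-level={log_level}"]
    # Named job parameters are passed as -param:NAME=VALUE.
    for name, value in (params or {}).items():
        cmd.append(f"-param:{name}={value}")
    return cmd


def run_job(cmd: List[str]) -> bool:
    """Run the job and report success (Kitchen exits with 0 when the job succeeds)."""
    result = subprocess.run(cmd)
    return result.returncode == 0
```

A small script that calls `run_job(build_kitchen_command("/opt/pentaho/data-integration/kitchen.sh", "/etl/jobs/nightly_load.kjb", params={"LOAD_DATE": "2016-01-01"}))` could then be triggered nightly by cron, with alerting added around the returned success flag. All of these paths and names are placeholders.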
I think you can say that Informatica is better than both of them, but it is far more expensive and the differences are small.
I will summarize it here. You can connect to relational databases such as MS SQL Server, MySQL, and Oracle. It also connects to files such as txt, csv, and xls (even though I don't recommend it), and to more complex files such as XML and JSON. It can connect to APIs, web services, and of course big data stores like Hive, Cassandra, MongoDB, and more, including bulk loads. I think Pentaho Kettle is versatile and has 95% of the sources you will ever need.
I find it suited for 90% of data integration projects. It's a very good tool: easy to use, stable, and affordable.
I think that the big data connections are still not perfect, so if you have a NoSQL DB / Hadoop / Cassandra, you might consider extracting the data to a file at the source using MapReduce. Also, if you need a bulk load, it's sometimes better to run it directly with the target tool's own loader, for example on Redshift or InfiniDB (which is no longer with us).
Apart from that, I think it will suit you well.