A good tool for your ETL needs, keep an eye on the bill and explore valid alternatives.
May 22, 2024

Anonymous | TrustRadius Reviewer
Score 7 out of 10
Vetted Review
Verified User

Overall Satisfaction with Matillion

We use Matillion mainly as an orchestration tool to stage data from multiple sources and apply some light transformation. We also use it to export data to S3 via Snowflake.
The GUI is intuitive, and the web interface helps you get up and running very quickly.
We have some issues with the resources needed for some jobs: there is no visibility into the system resources used, and no automatic balancing of activity to keep the server from crashing.
We don't like that billing is based on the server's CPUs: we host the server ourselves, so we are already paying for it.
We would prefer a billing mechanism decoupled from the server resources for many reasons, including the fact that the server crashes on some jobs due to memory issues, and upgrading it would instantly double the bill for the same jobs. It doesn't scale naturally.
A few years ago, when we started using it, it was a great player in the cloud ELT world. Today it suffers because, while the interface is web-based, the engine itself is still monolithic and static, and hard to migrate to a new machine.

We will be looking for other tools that decouple the creation of data workflows from their actual scheduling and execution, so that you can use a central hub to plan and create the logic and then decide in which region/server to run it, without having to worry about installing a full server in every region.

Docker/Kubernetes comes to mind, but implemented and managed effortlessly behind the scenes. We don't want to deal with it, just use the tool.
  • Web interface is good enough
  • Set of built-in components available for orchestration/transformation
  • Integration with target database (Snowflake for us)
  • Static and monolithic, it will show its limits when running multiple concurrent jobs.
  • GitHub and versioning implementation is messy and broken. Don't use it.
  • There's no way to see/query the system resources; you just wait for a server to crash due to running out of memory. An admin panel would be appreciated, plus some env variables with up-to-date info.
  • API implementation is cumbersome and limited.
  • There's no concept of hub and worker engine: everything happens on the same server (designing workflows and executing them). Having separate lightweight ETL engines to run jobs could be better (something like Docker/Kubernetes/Lambda functions).
  • Handling of variables is limited, especially for values returned from sub-components.
  • Some components could return more metadata at the end of their execution instead of the standard one.
  • Billing is badly designed: it doesn't take into account that the server is hosted by the client. Expensive.
  • We had several issues with migration, where we had to start a new instance and then migrate the content. It was painful and time-consuming, and we also had to deal with the support and engineering teams on Matillion's side.
  • CDC doesn't work as expected or it is not a mature product yet.
  • Ability to have embedded analytics covering multiple systems just in one place
  • Hassle free data movement
  • CDC doesn't work properly, so real-time data is not an option
  • Dedicated team to handle server maintenance
  • When something goes wrong on the server side (server crash), investigating is slow and painful
Support for team collaboration (Git integration and versioning) is bad and doesn't help.
Learning curve is steep but fair.
Once you understand what the tool was designed for and how much it relies on the target DWH (it is more ELT than ETL), things get easier.

Moving data from a simple DB to the DWH can be achieved after a few days of learning; adding logic or transformations will take some months to master, including the handling of variables and how they behave.

You will be tempted to use the Python scripting component instead of the built-in components. It can be handy, but it also keeps you from using the tool's full potential.

Some components, in order to remove the complexity of the task, ended up being complex on their own. This layer means that you have to learn how the component works more than the challenges of the underlying task.
It doesn't scale. It is flexible in what it does and tries to do it smartly, but that's it.
Removes most of the complexity around setting up and preparing things.
If you could describe in words what needs to be done to move data from A to B, the implementation in Matillion would probably be the closest match in terms of how easy it is to understand what you are doing and how.

Do you think Matillion delivers good value for the price?

No

Are you happy with Matillion's feature set?

Yes

Did Matillion live up to sales and marketing promises?

I wasn't involved with the selection/purchase process

Did implementation of Matillion go as expected?

No

Would you buy Matillion again?

No

If the target DWH is one of the big players, it could be a good option.
It can retrieve data from multiple different sources and handle them internally.
It is expensive, and since it is hosted by the client, there is also the infrastructure burden of maintaining and paying for the server.

Considering the resources needed, and hence the license cost that scales with them (despite the fact that you already host and pay for them), I wouldn't suggest the tool to a small company; and once you are big enough, you probably want to jump to bigger, more mature tools.

Matillion is a nice niche player with some nice-to-have features, probably suited to a mid-size company that has some money available for the license but still a small infrastructure requiring just one or two Matillion servers.

A global company with multi-region needs will drown under the burden of handling, updating, and maintaining all the servers independently, and of paying for each one of them.

Matillion Feature Ratings

Connect to traditional data sources: 7
Connect to Big Data and NoSQL: 6
Simple transformations: 7
Complex transformations: 6
Business rules and workflow: 7
Collaboration: 2
Testing and debugging: 3