Good integration and orchestration tool, not great for data transformation.
May 26, 2024

Anonymous | TrustRadius Reviewer
Score 4 out of 10
Vetted Review
Verified User

Overall Satisfaction with Matillion

Matillion ETL has been used in my company for a few years to run data ingestion and transformation pipelines that populate the main analytical data warehouse (DWH) and send data to other systems like Salesforce or the internal customer support tools. In particular, Matillion ingestion components are used to integrate with various external APIs, and Matillion Redshift transformations are used to reshape the data into a data model that makes sense for analytics.
  • Ingestion of data from popular systems (Salesforce, Trustpilot).
  • Orchestration of jobs with complex decision flows (retries, parameterized jobs, conditional flows).
  • Error notifications.
  • Git integration is limited and cumbersome.
  • GUI-based data transformation makes it very hard to apply good engineering practices to data pipelines.
  • Web-based development (there is no offline version) introduces a single point of failure for developers' interaction with the platform.
  • Matillion has been the backbone of my company's analytical functionalities for 10+ years, so it has a good ROI.
  • The price is ok for what our company built with it, but it becomes less competitive if the tool is not used to its fullest.
Data modeling and SQL knowledge are still strongly recommended when using Matillion, because designing a successful and maintainable data transformation requires modeling it with a developer mindset. The data transformation components are often a thin wrapper around SQL concepts, and they don't simplify life that much, IMO. An example is the window component: SQL window functions are a complex concept, and exposing them through a different interface does not make them easier to understand. On the other hand, low-code ingestion is very good and usable, especially for GSheets, Salesforce, and others.
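To make the window point concrete, here is a minimal sketch of the concepts a user has to understand no matter which interface exposes them: partitioning, ordering, and the frame. It runs against an in-memory SQLite database (3.25 or newer) purely for convenience; the `orders` table, its columns, and the sample rows are hypothetical, and this is not the SQL that Matillion generates for Redshift.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
rows = [
    ("acme", "2024-01-01", 100),
    ("acme", "2024-01-02", 50),
    ("acme", "2024-01-03", 25),
    ("globex", "2024-01-01", 10),
    ("globex", "2024-01-02", 40),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# PARTITION BY restarts the calculation for each customer, ORDER BY defines
# the row sequence inside a partition, and the ROWS clause (the frame)
# limits which rows feed the aggregate for the current row.
query = """
SELECT customer, order_date, amount,
       SUM(amount) OVER (
           PARTITION BY customer
           ORDER BY order_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total,
       ROW_NUMBER() OVER (
           PARTITION BY customer
           ORDER BY order_date DESC
       ) AS recency_rank
FROM orders
ORDER BY customer, order_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

A GUI form still has to ask for the same three things (partition, order, frame), which is why the wrapper does not remove the need to understand them.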
Matillion is pretty intuitive if someone is already familiar with data pipeline concepts. My onboarding lasted a few days, and I was able to be productive with bug fixes and improvements on existing pipelines in a week or so. Designing and building a new pipeline is also very easy and quick, but maintaining it once it becomes complex is pretty complicated and requires a lot of investigation work.
Functional scalability is good: there are many connectors and supported systems out of the box, and it's easy to create a custom component to interact with a system that is not covered by the out-of-the-box connectors. From a performance point of view, my experience with scalability is not good, and the limits are tied to the Matillion business model:
  1. The maximum parallelism of the running jobs depends on the number of cores of the machine where Matillion is deployed. AFAIK it's only possible to deploy Matillion on a single machine (EC2-like), and the license price depends on the number of cores that the machine has.
  2. The scalability of the UI is pretty bad (random crashes/slowness), and the number of concurrent open sessions is limited by design (again, pricing-related), even if the sessions belong to the same user.
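The effect of a core-bound parallelism cap can be sketched with a fixed-size worker pool. This is only an illustration of the general behavior, assuming one job slot per core; it is not Matillion's actual scheduler, and `run_jobs`, `slots`, and the sleep-based dummy job are hypothetical names made up for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_jobs(job_count, slots, job_seconds=0.05):
    """Run dummy jobs with at most `slots` at once; return the peak concurrency seen."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job(_):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(job_seconds)  # stand-in for real pipeline work
        with lock:
            running -= 1

    # The pool size plays the role of the core count: extra jobs wait in a queue.
    with ThreadPoolExecutor(max_workers=slots) as pool:
        list(pool.map(job, range(job_count)))
    return peak


peak = run_jobs(job_count=8, slots=2)
print(f"peak concurrent jobs: {peak}")
```

With eight jobs and two slots, the remaining jobs queue until a slot frees up, so total runtime grows with the number of jobs unless the machine (and, per the pricing model described above, the license) is scaled up.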
Both the Databricks platform and dbt Cloud are more powerful in terms of the development lifecycle and the data use cases they cover. They are also more complex and require specialized data engineering skills. Matillion has a lower barrier to entry for small data platforms and simpler use cases. Still, it becomes harder to manage and use as the use cases grow in number or complexity and the data platform becomes more sophisticated.

Do you think Matillion delivers good value for the price?

Not sure

Are you happy with Matillion's feature set?

Yes

Did Matillion live up to sales and marketing promises?

I wasn't involved with the selection/purchase process

Did implementation of Matillion go as expected?

Yes

Would you buy Matillion again?

No

Matillion is ok for orchestrating pipelines that require complex control flows, like retries, parameterized jobs, conditional branches, etc. I have had good experiences using it to launch Docker containers on Docker Swarm or Kubernetes, delegating the heavy data lifting to the application running in the container. Another good experience is with the integration components offered out of the box: Google Spreadsheet ingestion, Salesforce ingestion, reverse ETL, and others. These save a lot of development time that would otherwise go into implementing a custom connector. On the other hand, I wouldn't recommend using the data transformation components, which generate SQL code from a GUI-based configuration: they make it very hard to maintain the transformation logic (find and replace is impossible, DRY is hard).
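For contrast, here is a minimal sketch of what DRY looks like when transformation logic lives in code rather than in GUI configuration: one helper holds a "keep the latest row per key" pattern, and every table reuses it. The helper `latest_per_key_sql`, the tables, and the data are all hypothetical, and SQLite stands in for the warehouse only so the example can run anywhere.

```python
import sqlite3


def latest_per_key_sql(table, columns, key, ordering):
    """Build a 'latest row per key' query.

    Identifiers are interpolated directly, so this is only safe for trusted,
    hard-coded names like the hypothetical ones used below.
    """
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} FROM ("
        f"SELECT {cols}, ROW_NUMBER() OVER ("
        f"PARTITION BY {key} ORDER BY {ordering} DESC) AS rn "
        f"FROM {table}) WHERE rn = 1 ORDER BY {key}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER, name TEXT, updated_at TEXT)")
conn.execute("CREATE TABLE tickets (ticket_id INTEGER, status TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "Acme", "2024-01-01"), (1, "Acme Ltd", "2024-02-01"), (2, "Globex", "2024-01-15")],
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(10, "open", "2024-03-01"), (10, "closed", "2024-03-05")],
)

# The same pattern is reused for both tables; changing the logic means
# editing one function instead of reconfiguring every component by hand.
accounts = conn.execute(
    latest_per_key_sql("accounts", ["account_id", "name", "updated_at"], "account_id", "updated_at")
).fetchall()
tickets = conn.execute(
    latest_per_key_sql("tickets", ["ticket_id", "status", "updated_at"], "ticket_id", "updated_at")
).fetchall()
print(accounts)
print(tickets)
```

Because the logic is plain text, it can also be searched, replaced, reviewed in a diff, and versioned in Git like any other code.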

Matillion Feature Ratings

Connect to traditional data sources: 8
Connect to Big Data and NoSQL: 7
Simple transformations: 2
Complex transformations: 2
Business rules and workflow: 7
Collaboration: 2
Testing and debugging: 4