Matillion has made complex ETL of massive quantities of variably structured data approachable for us.
April 18, 2019

Edward Hunter | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User

Overall Satisfaction with Matillion

Our company, Clutch, ingests, transforms and otherwise manipulates massive quantities of retail customer data every day. Because even retailers in the same vertical have different processes, structures and even attributes within their data, we needed a tool to help us rapidly develop ETL, and we needed one tailored to columnar data and cloud storage, such as Amazon's Redshift column-store database and S3. While we're cost sensitive, we also didn't want to reinvent the wheel.

More importantly, we wanted to avoid a cloud-based product that tried to act like traditional enterprise ETL tools like Informatica or SSIS, only to find ourselves using clunky, complicated and undiscoverable browser-based tools.

Finally, we are a shop that often 'fixes tires on a moving car', so the tool we chose needed to be intuitive and the support behind it responsive. Matillion has met these needs and more, quickly becoming a go-to part of most of our data solutions and increasingly visible in other areas as well, such as reporting on cloud-based systems like Salesforce and JIRA.
  • Matillion (for Redshift) understands working with Redshift, and its components adapt well to changing aspects of extraction, transformation, and loading.
  • Because Matillion is billed for actual use, we only pay for the uptime of our instance. This is way more powerful than we expected, as we're able to maintain task-specific Matillion instances that are brought online to handle workloads, then taken down when not in use.
  • Matillion's support is probably the best there is, for any software. You're never asked the proverbial 'is the machine on' type questions, they always get back to you faster than expected, and the people conducting the support are definitely not reading scripts. They know what you're trying to do because they've probably done it.
  • Matillion often surprises us with its flexibility and adaptability to tasks, even some you'd normally consider outside the purview of traditional ETL.
  • While ETL usually involves fetching, moving, changing and loading/unloading data, sometimes destroying it is required, and Matillion is understandably light in these capabilities. For example, when we finish processing data that arrived via SFTP, we like to move the files to a location that reflects the outcome, like a processed area or a failed area. It would be great if Matillion offered a way to do this: if not a 'delete file', then perhaps a 'move file' where you must provide a source and destination.
  • A way to flush queued tasks without rebooting.
  • Better task logging and communication features, such as alerts.
  • At first, we thought Matillion was too expensive, until we realized the platform could be deployed and used only when needed and set it up that way. This has not only delivered solid ROI, it has also allowed us to start treating ETL processes and heavy lifting as cost centers that can be attached to a project.
  • Much of our ETL was previously being addressed by custom code, which limited our agility. Matillion has turned that around considerably.
  • Because of its tight integration with Redshift and AWS, we can now explore, understand and ingest data from our customers much faster than before.
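
The 'move file' step requested in the list above can be handled outside Matillion in the meantime. The following is a minimal sketch in Python, assuming the SFTP files have already landed on a local or mounted file system. The function name `archive_file` and the `processed`/`failed` directory names are hypothetical, chosen only for illustration, and are not part of Matillion.

```python
import shutil
from pathlib import Path


def archive_file(src: Path, succeeded: bool, base_dir: Path) -> Path:
    """Move src into base_dir/processed or base_dir/failed.

    Hypothetical helper: the directory names are illustrative assumptions.
    Returns the new location of the file.
    """
    # Pick the target area based on how processing ended.
    target_dir = base_dir / ("processed" if succeeded else "failed")
    target_dir.mkdir(parents=True, exist_ok=True)

    # Require an explicit source and destination, as the review suggests.
    destination = target_dir / src.name
    shutil.move(str(src), str(destination))
    return destination
```

A caller would run this once per file after the load finishes, passing `succeeded=False` when the load raised an error.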
Matillion has a bright future ahead, and the sky is the limit for where it can improve and innovate. Though I have many years of experience with enterprise-level ETL tools, many of my colleagues do not, and I've been able to get them up to speed on the platform faster than I ever expected.
We were able to get Matillion up and running within days, and this was important because I needed to be able to sell the notion of this tool internally. To do this, I needed Matillion to emulate a complex incumbent process in such a way that I could communicate and demonstrate it to key stakeholders in less than a month. In less than a week, our DevOps team provisioned the instance, and we hammered out a few basic processes to fetch data, transform it and load it. Over the course of the next couple of weeks, we took the time to really explore the components, and naturally we had questions. Matillion support had answers, often the same day. Finally, as the POC with the internal stakeholders approached, we generated fantastic-looking (though very large) documentation about our newly developed processes, which we shared. Ultimately it was a pretty easy sell.
This came as a surprise to us because if you ran Matillion 24/7 it would be pricey, let alone if you needed to scale to more than one instance. However, because you can have your instance on or offline when you need it, not only can you control those costs, but you can spin up as many instances at whatever size you need for the job at hand, then take them down when the job is done.
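
To make the cost argument concrete, here is a rough worked example in Python. All figures are hypothetical assumptions for illustration only: the hourly rate, the 730-hour month, and the four hours of daily usage do not come from Matillion's or AWS's actual pricing.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def monthly_cost(hourly_rate: float, hours_used: float, instances: int = 1) -> float:
    """Cost of running `instances` instances for `hours_used` hours each."""
    return hourly_rate * hours_used * instances


HYPOTHETICAL_RATE = 2.0  # assumed combined hourly rate, not a real price

# One instance left running 24/7.
always_on = monthly_cost(HYPOTHETICAL_RATE, HOURS_PER_MONTH)

# One instance brought online for 4 hours a day over 30 days.
on_demand = monthly_cost(HYPOTHETICAL_RATE, 4 * 30)

print(f"always on: {always_on:.2f}, on demand: {on_demand:.2f}")
```

Under these assumed numbers the on-demand pattern costs a fraction of the always-on one, which leaves room to run several task-specific instances for the same budget.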
Matillion is great for quickly hammering out ETL processes and getting them up and running. It could be better at 'productionalizing' though, perhaps by bundling alerts and communication into its scheduling capabilities. While Matillion works well with AWS resources like Redshift, S3, and SQS, being able to integrate things like S3 triggers and Lambda functions would reduce the need to develop custom processes outside the platform.

Matillion being a pay-while-using platform makes it ideal for compartmentalizing processes. If you want to create cost centers around the heavy lifting involved in a particular project, you can dedicate a specific Matillion instance to it and spin that instance up and down as needed. This tasking and scheduling, however, must largely be driven by utilities outside Matillion. Matillion support and blogs are super informative and ready to deliver solutions for implementing things like this, but it would be great if the platform itself integrated with things like CloudWatch events to help schedule and manage its uptime, downtime and operating costs.
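
As an illustration of the kind of outside utility this requires, below is a minimal Python sketch. It assumes the Matillion instance runs on an ordinary EC2 instance and that the caller supplies a boto3 EC2 client. The function name `set_instance_state` and the instance ID in the usage comment are hypothetical. This is one possible approach, not a method recommended by Matillion.

```python
def set_instance_state(ec2_client, instance_id: str, running: bool) -> str:
    """Start or stop one EC2 instance and return the requested action.

    `ec2_client` is expected to behave like boto3.client("ec2"); it is
    passed in so the function can be exercised without AWS credentials.
    """
    if running:
        # Bring the instance online before a scheduled workload.
        ec2_client.start_instances(InstanceIds=[instance_id])
        return "start"
    # Take the instance down once the workload has finished.
    ec2_client.stop_instances(InstanceIds=[instance_id])
    return "stop"


# Usage sketch (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   set_instance_state(ec2, "i-0123456789abcdef0", running=True)
```

A function like this could be called from a scheduled job or a Lambda function to bracket each workload, so the instance only accrues cost while it is doing work.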

Matillion Feature Ratings

Connect to traditional data sources: 9
Connect to Big Data and NoSQL: 7
Simple transformations: 6
Complex transformations: 6
Metadata management: 7
Business rules and workflow: 8
Collaboration: 9
Testing and debugging: 6
Integration with data quality tools: Not Rated
Integration with MDM tools: Not Rated