Overall Satisfaction with Matillion
We use Matillion to load millions of rows of data each and every day. We got our warehouse up and running using Matillion in just a few months. We use it for all our reporting and have built out our commissions processes using this tool. We are able to connect data from multiple sources into one coherent location that makes our reporting so much easier.
- Matillion is an ELT tool rather than an ETL tool. This means that it uses the database engine to manipulate data, which is much faster than a traditional ETL tool, where speed is lost in the movement of data.
- Building Transformation Jobs. The flow-chart style developer interface makes it simple and easy to build jobs. But underlying that is the ability to create complex SQL and/or Python components which make just about anything you can think of possible.
- AWS integration. The integration with AWS allows a multitude of possibilities such as automatically kicking off jobs when a file is loaded in S3 or sending a notification of job success or failure via SNS.
- Matillion has limited capabilities when interacting with databases other than Redshift and RDS. Getting data from other databases is pretty easy. Putting data back into databases other than Redshift and RDS is more limited. That said, it's specifically built to load Redshift, and it does that well.
- There are times when I get disconnected from the server. This may happen once or twice a day. Nothing is lost and it's a simple matter of logging back in. It's not my internet either as my coworkers in different areas of the country experience the same thing. I have seen some improvement in some of the later releases.
- I was able to build a complete commissions processing system for over 3500 employees using Matillion in just over a month. This included gathering data from 5 different sources, combining them together appropriately and building in the business rules. I was awarded one of the top awards our company gives for this project.
- Our big three Business Intelligence tools: Matillion, Redshift and DOMO allow us to give our employees the right data at the right time. Our productivity has increased as 75%+ of our employees look at their data every day they work.
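The AWS integration point above mentions kicking off jobs when a file lands in S3. As a rough sketch of the first half of that flow, the snippet below pulls the bucket and file name out of a standard S3 event notification. The function name `files_from_s3_event`, the bucket name and the file key are made up for illustration; how the job is then launched in Matillion is not shown, since the review does not describe it.

```python
from urllib.parse import unquote_plus


def files_from_s3_event(event):
    """Return (bucket, key) pairs for each object-created record in an S3 event."""
    files = []
    for record in event.get("Records", []):
        # Skip anything that is not a new-object event (for example, deletes).
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications ("+" means space).
        key = unquote_plus(record["s3"]["object"]["key"])
        files.append((bucket, key))
    return files


# Hypothetical event, trimmed to the fields the function reads.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-landing-bucket"},
                "object": {"key": "incoming/sales+2024.csv"},
            },
        }
    ]
}

print(files_from_s3_event(sample_event))
```

The returned pairs are what a downstream step would need in order to decide which load job to start.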
Matillion is easy to use. After the initial configuration, adding data is very simple. There are lots of components that are built to make things simple. A wizard to add data from S3 into Redshift is something I use all the time. Joining jobs together into the main job is just a matter of dragging and dropping them on the page. Scheduling the main job is easy too. This tool has made my job easier and I can sleep at night knowing it's doing its job.
We started out using the AWS medium server and ended up moving to the large server about 6 months into our project. We found having the additional memory helped with some of our timeout issues. We also found that we liked the documentation components you get with the larger server. We run our servers 24x7 and they are pretty solid. I have a small client for whom I built the processes so that the Matillion server stays shut down except for 1 hour a day, when it launches, processes the FTP files it receives and then shuts down. So instead of paying for 24 hours a day, they only pay for 1 hour a day. Depending on your needs, this could be a huge cost savings. What I'm trying to say is you can set it up at the scale that you need.
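To put the 1-hour-versus-24-hour comparison in numbers, here is a small back-of-the-envelope calculation. It assumes a 30-day month and counts only billed server hours; storage, startup time and the actual hourly rate are left out, and the helper name `monthly_instance_hours` is just for illustration.

```python
def monthly_instance_hours(hours_per_day, days=30):
    """Billed server hours in a month for a given daily uptime."""
    return hours_per_day * days


always_on = monthly_instance_hours(24)  # server left running 24x7
scheduled = monthly_instance_hours(1)   # server up for 1 hour a day

# Fraction of billed hours avoided by the scheduled setup.
reduction = 1 - scheduled / always_on

print(always_on, scheduled, f"{reduction:.1%}")  # 720 30 95.8%
```

Whatever the hourly rate is, the scheduled setup bills roughly one twenty-fourth of the hours.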
We looked at several different options including Talend, SSIS, and a homegrown Python application. One of the biggest issues we were trying to address was the ability to schedule, monitor and have total visibility into what was going on. After review, Matillion came out as the clear winner. It wasn't the cheapest but it did what we needed it to do.
If you have a Redshift database, this tool is specifically built for you. It allows you to automate the loading of the data warehouse easily. Its scheduling ability allows you to time the load when you want, and its notification ability allows you to make sure it has loaded successfully. It translates your jobs into SQL that is run within the Redshift environment, which makes it very fast. Tracking of the jobs and logging is quite helpful when tracing any issues that might have come up.
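The "load first, then transform with SQL inside the database" idea can be shown with a tiny example. This is only a sketch of the ELT pattern, not Matillion's generated SQL: it uses Python's built-in `sqlite3` as a stand-in for Redshift, and the table names and sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse.
conn = sqlite3.connect(":memory:")

# Load step: raw rows go into the database untouched.
conn.execute("CREATE TABLE raw_sales (rep TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_sales VALUES (?, ?)",
    [("ann", 100.0), ("bob", 250.0), ("ann", 50.0)],
)

# Transform step: the database engine does the work in SQL,
# so the rows never leave the database to be reshaped elsewhere.
conn.execute(
    """
    CREATE TABLE sales_by_rep AS
    SELECT rep, SUM(amount) AS total
    FROM raw_sales
    GROUP BY rep
    """
)

rows = conn.execute(
    "SELECT rep, total FROM sales_by_rep ORDER BY rep"
).fetchall()
print(rows)  # [('ann', 150.0), ('bob', 250.0)]
```

In a traditional ETL setup the aggregation would happen in a separate tool before the load, which means moving every row out and back in again.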