ETL processing you can set up once and forget about
November 06, 2015
Score 8 out of 10
Overall Satisfaction with ActiveBatch
ActiveBatch was, and still is, managed and integrated into other internal systems by the whole IT DevOps team. It forms a critical layer that serves the entire organization by providing automated links between my former company's systems and those of its clients. In any given week the company would have literally hundreds of file import/export transactions that needed to happen on a single day, every day, every month, or on some complex combination of days at different hours of the day and night, and ActiveBatch consistently delivered. It allowed for the efficient design and redesign of import/export jobs (reusing schedule objects, partially executing the nodes of a job to help with testing, and effortlessly changing parameters when moving from testing to deployment), logged transaction errors, and worked with the wide variety of file processing tools we used internally (ETL tools, SSIS packages, .NET code, etc.). Its relatively intuitive design allows it to be used and managed by people with even a bare minimum of IT experience, without sacrificing power and reliability.
- One good feature I already mentioned is that once you create a configuration object (such as a schedule object), you can reuse it as much as needed. This minimizes errors in scheduling because there's less opportunity to make a configuration error with future similar jobs that would follow the same schedule, makes it more efficient to schedule those new jobs, and makes it efficient to make scheduling changes--change the scheduling object once and all the related jobs' schedules are automatically changed.
- Partial path execution is a good strength, especially for testing/debugging. I can have a decently sized tree of process nodes for a given job, but I can easily deactivate the nodes I do not want to include in my testing.
- Being able to define test and production jobs in separate environments and easily change the settings of one without affecting the other is another strength. Often I would set up a job in a test environment and, after testing it, port it to a very similar production environment by changing only 3-4 parameter settings. I could then deactivate the test job while keeping it fully set up in case it was needed for future enhancement or troubleshooting.
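To make the reuse and promotion ideas above concrete, here is a minimal Python sketch. It is not ActiveBatch's API: the `Schedule` and `Job` classes, the `promote` method, and all server, database, and job names are hypothetical, made up purely to illustrate why a shared schedule object and a handful of parameter overrides cut down on configuration errors.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    """Hypothetical shared schedule object (not an ActiveBatch type)."""
    name: str
    days: tuple
    hour: int


@dataclass
class Job:
    """Hypothetical job that points at a schedule instead of copying it."""
    name: str
    schedule: Schedule
    params: dict = field(default_factory=dict)
    enabled: bool = True

    def promote(self, new_name, overrides):
        """Copy this job for another environment, changing only a few parameters."""
        return Job(new_name, self.schedule, {**self.params, **overrides})


weeknights = Schedule("weeknights-2am", ("Mon", "Tue", "Wed", "Thu", "Fri"), 2)

# Illustrative parameter names and values only.
test_job = Job(
    "client-import-test",
    weeknights,
    {"server": "test-sftp", "database": "StagingTest", "notify": "dev@example.com"},
)
prod_job = test_job.promote(
    "client-import-prod",
    {"server": "prod-sftp", "database": "Staging"},
)
test_job.enabled = False  # keep the test job defined, but parked

# One edit to the shared schedule reaches every job that references it.
weeknights.hour = 3
```

Because both jobs hold a reference to the same schedule rather than a copy, the single change to `weeknights.hour` shows up in each of them, which is the behavior I found so useful in practice.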
- While I like being able to reuse scheduling objects and the like, more work needs to be done to help one not reinvent the same scheduling object without realizing it and to then find scheduling objects that have similar schedules even if they are worded slightly differently than what I'm expecting. It needs to be "smarter". It was easy to accumulate a pool of scheduling objects that while named differently, had exactly the same schedule. It was also hard to sift through to find the little differences between similarly named scheduling objects.
- The logger had a clean enough interface but it could be more legible and offer contextual help to describe the messages one is reading. I remember trying to read black text on a medium gray background with Courier size 10-11 font. Not so easy to read quickly and to parse through the relevant parts. I think some selective color coding would be good and links to message definitions or any form of further information would be nice. Maybe the ability to export the log file to various formats would also be helpful.
- I don't remember a top-level dashboard that would show at a glance which jobs failed completely and which had only warnings or non-critical errors. I got emails only because I had configured them. Here again, color-coding by error type would be a good nice-to-have.
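As a rough illustration of the "smarter" duplicate detection I wished for with scheduling objects, the Python sketch below groups schedules by what they actually do instead of by what they are called. The data shape (a name mapped to days, hour, and minute) and the function name `find_duplicate_schedules` are assumptions for the example, not anything ActiveBatch exposes.

```python
from collections import defaultdict


def find_duplicate_schedules(schedules):
    """Group schedule names by their normalized definition.

    `schedules` maps a name to a (days, hour, minute) tuple; this shape is
    an assumption for illustration, not ActiveBatch's schedule format.
    """
    groups = defaultdict(list)
    for name, (days, hour, minute) in schedules.items():
        # Normalize so that day order and letter case do not hide a match.
        key = (frozenset(d.lower() for d in days), hour, minute)
        groups[key].append(name)
    # Only definitions shared by more than one name count as duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Made-up schedule names and definitions.
schedules = {
    "Weeknights 2AM": (["Mon", "Tue", "Wed", "Thu", "Fri"], 2, 0),
    "M-F early morning": (["fri", "thu", "wed", "tue", "mon"], 2, 0),
    "Weekend 2AM": (["Sat", "Sun"], 2, 0),
}
duplicates = find_duplicate_schedules(schedules)
```

Here the two weekday schedules end up in the same group even though their names have nothing in common, which is exactly the kind of overlap that was hard to spot by eye.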
- Definitely increased ETL tool testing and deployment efficiency for the IT DevOps staff.
- Troubleshooting and retesting of problems with ETL tools is enhanced, but there's definitely room for improvement there. Emails and logging helped but didn't really add anything beyond what GlobalScape and Microsoft already provided. So a positive, but not a strong positive there.
- Given how much data it moves in and out, the errors that did come up had to be dealt with very quickly, and this product could do more to help with that. That said, the errors were few and far between and almost never had anything to do with ActiveBatch itself (usually the cause was the quality of the data it was handling), so in that sense it brought visibility to areas where quality could be improved between the company and the clients.
N/A - It was already in place when I came on the scene, but as I note elsewhere in this review, it is much more powerful than SQL Server Agent and probably anything we would've come up with from scratch using .NET. However, if your needs are small and traffic is light, then SQL Server Agent or something smaller and less powerful (and less expensive) than ActiveBatch might work just fine.
I used to work at a company that only used SQL Server Agent to handle imports and exports of ETL data. ActiveBatch is far more powerful and easier to use, so I definitely would recommend it. I would ask three questions: (1) Do you handle a large volume of exports and imports in a given week? (2) Do you need a lot of configurable options, such as with scheduling? (3) Do you use a variety of ETL processing tools (such as GlobalScape EFT and Microsoft SSIS) but want one tool to work with them all? If the answer to those questions is yes, then ActiveBatch would fit the bill well.