Databricks offers the Databricks Lakehouse Platform (formerly the Unified Analytics Platform), a data science platform and Apache Spark cluster manager. The Databricks Unified Data Service provides a platform for data pipelines, data lakes, and data platforms.
$0.07
Per DBU
OpenText Magellan
Score 9.0 out of 10
N/A
OpenText Magellan Analytics Suite leverages a comprehensive set of data analytics software to identify patterns, relationships and trends through data visualizations and interactive dashboards.
N/A
SSIS
Score 7.6 out of 10
N/A
Microsoft's SQL Server Integration Services (SSIS) is a data integration solution.
SSIS is similar to Alteryx and Informatica PowerCenter in that all three are drag-and-drop ETL tools with similar functionality. Alteryx is a step ahead because it has some advanced ETL functionality, including statistical calculations, and a better ability to set …
Shops with medium to large data throughput will benefit the most from Databricks Spark processing. Smaller teams may find the barrier to entry a bit too high for casual use cases. The overhead of kicking off a Spark compute job can actually make small workloads take longer, but past a certain data volume the performance returns cannot be beat.
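The trade-off described above (a fixed job-startup overhead versus higher throughput once the job is running) can be sketched as a simple break-even calculation. This is only an illustration under assumed numbers: the function names, the 60-second startup overhead, and the rows-per-second rates are hypothetical and are not measurements of Databricks or any other product.

```python
def single_node_seconds(rows, rows_per_s):
    """Time for a plain single-machine job with no startup overhead."""
    return rows / rows_per_s


def cluster_seconds(rows, startup_overhead_s, rows_per_s):
    """Time for a cluster job: fixed startup overhead plus processing time."""
    return startup_overhead_s + rows / rows_per_s


def break_even_rows(startup_overhead_s, single_rows_per_s, cluster_rows_per_s):
    """Row count at which both approaches take the same time.

    Solves rows / single = overhead + rows / cluster for rows.
    """
    if cluster_rows_per_s <= single_rows_per_s:
        raise ValueError("the cluster must be faster for a break-even point to exist")
    return startup_overhead_s / (1.0 / single_rows_per_s - 1.0 / cluster_rows_per_s)


# Hypothetical figures: 60 s startup, 100k rows/s on one machine, 1M rows/s on a cluster.
threshold = break_even_rows(60, 100_000, 1_000_000)
```

Below the threshold the startup overhead dominates and the simpler job finishes first; above it the cluster pulls ahead, which matches the reviewer's "past a certain point" remark.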
If you do not have a large budget and are a large organization, I would steer clear of Actuate. If you are looking to do very complex dashboarding, I would not use them. Your developers have to be very skilled to work with this. Plan to bring in consultants if necessary to help your process. Ad hoc reporting is weak. If your pricing is user-based and you expand, this could be very expensive.
As I mentioned earlier, SQL Server Integration Services is suitable if you want to manage data from different applications. It really helps in fetching the data and generating reports. Its automation makes it very easy and time-efficient. It works well with large databases as well. But it doesn't work well with real-time data; it takes some time to gather real-time data. I would not recommend using it in a real-time, fast-paced environment.
Connection managers for online data sources can be tricky to configure.
Performance tuning is an art form, and trialing different data flow task options can be cumbersome. SSIS could do a better job of providing performance data, including historical data, for monitoring.
Mapping destination using OLE DB command is difficult as destination columns are unnamed.
Excel or flat file connections are limited by version and type.
I am no longer working for the company that was using Actuate, but I believe they would continue to use it because the switching costs would be too high. It would require a complete rewrite of the reports, and even the newer version of Actuate (BIRT) required an almost complete report rewrite.
Some features should be revised or improved; some toolbox tools (when used with Visual Studio) should be less schematic and somewhat more flexible. For example, CSV data import is still very old-fashioned, and if the data format changes it requires a bit of manual labor to accept the new data structure.
It is an amazing platform for designing experiments and delivering deep-dive analyses that require executing highly complex queries. It also lets you share information and insights across the company through shared workspaces while keeping them secure.
In terms of graph generation and interaction, it could improve its UI and UX.
It is quite intuitive to use. It is fit specifically for sentiment, emotion, and intention analysis as well as text classification and text summarization. I would have given it a 10 if it were also fit for image processing and analysis. There is a huge market for analyzing video and image data.
SSIS is a great tool for most ETL needs. It covers 90% (or more) of use cases, and even in many of the cases where it is not ideal, SSIS can be extended via a .NET language to do the job well, in a supportable way, for almost any performance workload.
SQL Server Integration Services performance depends directly on the resources provided to the system. In our environment, we allocated 6 nodes of 4 CPUs, 64GB each, running in parallel. Unfortunately, we had to ramp up to such a robust environment to get the performance to where we needed it. Most of the reports are completed in a reasonable timeframe. However, in the case of slow-running reports, it is often difficult if not impossible to cancel the report without killing the report instance or stopping the service.
Some of the best customer and technology support that I have ever experienced in my career. You get what you pay for, and you get the Rolls Royce. It reminds me of SAS customer support in the 2000s, when the tools were reaching some limits and their engineers wanted to know more about what we were doing, long before "data science" was even a name. Databricks truly embraces the partnership with its customers and helps them with any given challenge.
The support, when necessary, is excellent. But beyond that, it is very rarely necessary because the user community is so large, vibrant, and knowledgeable that a simple Google query or forum question can answer almost everything you want to know. You can also get prewritten script tasks with a variety of functionality, which saves a lot of time.
The implementation may be different in each case. It is important to properly analyze the existing infrastructure to understand the kind of work needed, the software in use and how compatible its components are, and the features you want to exploit, so that you know what is possible and which features require integration with third-party tools.
The most important differentiating factor for Databricks Lakehouse Platform from these other platforms is support for ACID transactions and the time travel feature. Also, native integration with managed MLflow is a plus. EMR, Cloudera, and Hortonworks are not as optimized when it comes to Spark Job Execution. Other platforms need to be self-managed, which is another huge hassle.
It is vastly superior to these in many ways; for complex reporting it is a much more sophisticated solution. Visualizations are very good. JavaScript extensibility is very powerful; the others either don't support it or don't support it as well. Pentaho and MS are both OLAP-oriented. Pentaho is moving more toward big data, which was not our primary focus. Others are stuck in the Crystal Reports band metaphor.
I think SQL Server Integration Services is better suited for on-premises data movement and ADF is better suited for the cloud. Though ADF has more connectors, SQL Server Integration Services is more robust and has better functionality, simply because it has been around much longer.
Actuate can handle 50 to 60 sub reports inside a report very well.
Dynamically creating data sources, charts, graphs, and reports is the main advantage. We can do any level of drilling, and can create a performance matrix dashboard efficiently.
Without this, we would have to manually update a spreadsheet of our SQL Server inventory.
We would also have poor alerting; if an instance was down, we wouldn't know until it was reported by a user.
We only have one other person who uses SQL Server Integration Services; he's the expert. It would fall to me without him, and I would not enjoy being responsible for it.