Databricks offers the Databricks Lakehouse Platform (formerly the Unified Analytics Platform), a data science platform and Apache Spark cluster manager. The Databricks Unified Data Service provides a platform for data pipelines, data lakes, and data platforms.
$0.07 per DBU
Db2 Big SQL
Score 9.0 out of 10
N/A
IBM offers Db2 Big SQL, an enterprise-grade, hybrid, ANSI-compliant SQL-on-Hadoop engine that delivers massively parallel processing (MPP) and advanced data querying. Big SQL offers a single database connection or query for disparate sources such as HDFS, RDBMS, NoSQL databases, object stores, and WebHDFS.
Shops with medium to large data throughput will benefit the most from Databricks Spark processing. For smaller or casual use cases, the barrier to entry may be a bit too high. The overhead of kicking off a Spark compute job can actually make small workloads take longer, but past a certain point the performance returns cannot be beat.
My recommendation obviously would depend on the application. But I think, given the right requirements, IBM DB2 Big SQL is definitely a contender for a database platform, especially when disparate data and multiple data stores are involved. I like the fact that I can use the product to federate my data and make it look like it's all in one place. The engine is high performance, and if you want to use Hadoop, this could be your platform.
It is an amazing platform for designing experiments and delivering deep-dive analyses that require executing highly complex queries. Its shared workspaces also let us share information and insights across the company while keeping them secure.
In terms of graph generation and interaction, the UI and UX could be improved.
IBM DB2 is a solid service but hasn't seen much innovation over the past decade. It gets the job done and supports our IT operations across digital, so a fair rating seems right.
Some of the best customer and technology support that I have ever experienced in my career. You get what you pay for, and you get the Rolls Royce. It reminds me of SAS customer support in the 2000s, when the tools were reaching some limits and their engineers wanted to know more about what we were doing, long before "data science" was even a name. Databricks truly embraces the partnership with its customers and helps them with any given challenge.
IBM did a good job of supporting us during our evaluation and proof of concept. They were able to provide all necessary guidance, answer questions, help us architect it, etc. We were pleased with the support provided by the vendor. I will add the caveat that this support was all before the sale. However, we have a ton of IBM products, and IBM provides the same high level of support for all of them, so I don't see this being any different. I give IBM support two thumbs up!
What most differentiates the Databricks Lakehouse Platform from these other platforms is its support for ACID transactions and the time travel feature. Native integration with managed MLflow is also a plus. EMR, Cloudera, and Hortonworks are not as optimized when it comes to Spark job execution. The other platforms also need to be self-managed, which is another huge hassle.
MS SQL Server was ruled out because we didn't feel we could collapse environments. We thought of MS SQL Server as more of a one-for-one replacement for Sybase ASE, i.e., server for server. SAP HANA was evaluated and given a big thumbs up but was rejected because the SQL would have had to be rewritten at the time (now they have an accelerator so you don't have to). Also, there was a very low adoption rate within the enterprise. IBM DB2 Big SQL was not selected, even though it achieved high technical scores, because we could not find readily available talent and because of its low adoption rate within the enterprise (basically no adoption at the time). We ended up selecting Exadata because of its high adoption rate within the enterprise, even though HANA and Big SQL were technically superior in our evaluations.