Databricks offers the Databricks Lakehouse Platform (formerly the Unified Analytics Platform), a data science platform and Apache Spark cluster manager. The Databricks Unified Data Service provides a platform for data pipelines, data lakes, and data platforms.
$0.07
Per DBU
Google Cloud Dataflow
Score 9.1 out of 10
N/A
Google offers Cloud Dataflow, a managed streaming analytics platform for real-time data insights, fraud detection, and other purposes.
N/A
Pricing
Editions & Modules
Databricks Data Intelligence Platform
Standard: $0.07 per DBU
Premium: $0.10 per DBU
Enterprise: $0.13 per DBU
Google Cloud Dataflow
No answers on this topic
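As a rough illustration of how the per-DBU rates listed above turn into a bill, the sketch below multiplies a rate by DBU consumption. The rates come from the pricing table; the function name `estimate_cost` and the workload figures (4 DBUs per hour for 100 hours) are hypothetical, chosen only for the example. Actual DBU consumption depends on the cluster and workload, and the underlying cloud compute is typically billed separately.

```python
# Per-DBU list prices quoted in the pricing table (USD).
RATE_PER_DBU = {"Standard": 0.07, "Premium": 0.10, "Enterprise": 0.13}


def estimate_cost(edition: str, dbus_per_hour: float, hours: float) -> float:
    """Return the estimated DBU charge in USD, rounded to cents.

    dbus_per_hour and hours are assumptions supplied by the caller;
    they are not published figures.
    """
    if dbus_per_hour < 0 or hours < 0:
        raise ValueError("dbus_per_hour and hours must be non-negative")
    return round(RATE_PER_DBU[edition] * dbus_per_hour * hours, 2)


if __name__ == "__main__":
    # Hypothetical workload: 4 DBUs per hour, running for 100 hours.
    for edition in RATE_PER_DBU:
        print(f"{edition}: ${estimate_cost(edition, 4, 100):.2f}")
```

The same workload costs more on each higher edition purely because of the rate, so the comparison is only meaningful once a realistic DBU figure for the cluster is known.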
Offerings
Pricing Offerings | Databricks Data Intelligence Platform | Google Cloud Dataflow
Free Trial | No | No
Free/Freemium Version | No | No
Premium Consulting/Integration Services | No | No
Entry-level Setup Fee | No setup fee | No setup fee
Additional Details | — | —
Medium to large data-throughput shops will benefit the most from Databricks Spark processing. Smaller use cases may find the barrier to entry a bit too high for casual use. The overhead of kicking off a Spark compute job can actually make small workloads take longer, but past a certain point the performance returns cannot be beat.
It is best in cases where you have batch as well as streaming data, or where you have batch data right now and expect streaming data in the future. In those cases Dataflow is very good. It is also a good fit when most of your infra is on GCP. It might not be a good fit when you are already on AWS or Azure, or when you want in-depth control over security and management. In that case you can use Apache Beam directly instead of Dataflow.
More templates for BigQuery and App Engine. There are only limited options for templates, which can limit what we are able to use.
I would like native connectors for Excel (XLSX) to reduce the need for custom wrappers in financial pipelines.
Debugging Google Cloud Dataflow using only logs in Cloud Logging can be overwhelming sometimes, and it’s not always obvious which specific element in the flow caused a failure. It takes a lot of time.
It is an amazing platform for designing experiments and delivering deep-dive analyses that require highly complex queries. It also lets you share information and insights across the company through shared workspaces while keeping them secure.
In terms of graph generation and interaction, the UI and UX could be improved.
It really saved a lot of time, and its flexibility can give you infra that is future-proof for most use cases, be it streaming or batch data. With it you can also avoid resource-heavy big data offerings.
One of the best customer and technology support experiences I have had in my career. You get what you pay for, and you get the Rolls Royce. It reminds me of the customer support of SAS in the 2000s, when the tools were reaching some limits and their engineers wanted to know more about what we were doing, long before "data science" was even a name. Databricks truly embraces the partnership with its customers and helps them with any given challenge.
The most important factors differentiating the Databricks Lakehouse Platform from these other platforms are its support for ACID transactions and the time travel feature. Native integration with managed MLflow is also a plus. EMR, Cloudera, and Hortonworks are not as optimized when it comes to Spark job execution. Other platforms need to be self-managed, which is another huge hassle.