Databricks Lakehouse Platform (formerly Databricks Unified Analytics Platform) Pricing

Databricks Lakehouse Platform Pricing Overview

Databricks Lakehouse Platform has three pricing editions, ranging from $0.07 to $0.13 per DBU. Compare the editions below and read more information about the product here to see which one is right for you.

  • Standard: $0.07 per DBU (Cloud)
  • Premium: $0.10 per DBU (Cloud)
  • Enterprise: $0.13 per DBU (Cloud)
Pricing for Databricks Lakehouse Platform

Offerings

  • Does not have a free trial
  • Does not have a free/freemium version
  • Does not have premium consulting / integration services

Entry-level set up fee?

  • No setup fee

What TrustRadius Research Says

Databricks Pricing 2022

Data drives your business intelligence and all important decision-making, but managing extensive data workflows requires robust software. Databricks provides an analytics platform that gives data scientists fast access to all their data in one place. You can query and scale with ease and spend more time on analysis.

What is Databricks?

Databricks is a data analytics platform that can be used on the big three cloud computing vendors: Amazon Web Services, Microsoft Azure, and Google Cloud Platform. Their main product is the Databricks Lakehouse Platform. The software combines a data warehouse and a data lake, which means it can store raw data and make it available for analysis.


When it comes to functionality, it can be used for a wide range of data analysis needs. Data science, data engineering, machine learning, and business intelligence are all possibilities for your team. You can manage the data lifecycle for your entire team: your data scientists can work with big data, and your data analysts can interpret the metrics so your company can grow from those insights. The efficiency of the platform's design lies in the fact that you can access all your metrics in real time.


The typical use cases for the analytics platform are finance, healthcare, manufacturing, retail, and IT teams. While some small companies do use Databricks, it is predominantly used by enterprises with 1,000+ employees.


Their software is also available in a free and open-source version called the Databricks Community Edition. This version runs on the Apache Spark platform, a robust engine for data engineering. If you can't afford the high costs of data science software, then the Community Edition may be the best choice, especially if you're a startup.


The community version is hosted on AWS but does not incur costs for the service itself (you will be charged for applications you use with it), so you only pay for what you really need. Note that the paid versions of Databricks on the cloud vendors are not available on-premises.


Open source software projects are of high importance to the Databricks team. They contribute to a number of open source projects like Delta Lake, Apache Spark, and MLflow. You can find a full list of their open-source projects here.


Databricks also has its own engine called Photon. It's a high-speed query engine that is compatible with Apache Spark APIs, so it works with your existing code. For more details see Photon's documentation here.


Each cloud vendor offers some or all of the three Databricks plans: Standard, Premium, and Enterprise. The features of the plans are defined by Databricks and stay the same for each cloud vendor.


The Standard version of Databricks has access to MLflow, Delta Lake, and Apache Spark dashboards. You can also use connectors and integrations. Teams won't be able to access the Databricks workspace, SQL optimization, or autoscaling. In terms of administrative features, there is an administrator console but not much else for governance. Security includes single sign-on (SSO), but not more advanced features like role-based access control.


Premium comes with all features (excluding add-ons, depending on the version). Only the AWS version lacks a couple of security features, like compliance protocols, because AWS makes those exclusive to its Enterprise plan. More on this in the pricing section below.


The plan is very feature-rich, offering the Databricks workspace, SQL optimizations as well as all performance and governance functionality.


The Enterprise plan is an outlier because GCP and Microsoft Azure don't offer it; it has all the same features as Premium. The one difference is that on AWS, Enterprise is the only plan eligible for certain add-ons like extra security. With the other cloud vendors, Premium becomes the plan eligible for additional features.


If you want to learn more about Databricks check out their FAQs section here. Below is also a video that goes over the history of Databricks.


What is Databricks? The Data Lakehouse You've Never Heard Of

How Much Does Databricks Cost?

Similar to most cloud computing applications, pricing for this software is based on rates charged for your exact usage rather than a subscription with set costs. The rates per specific service and version vary greatly depending on the cloud vendor.


Each cloud provider of Databricks has different plans, compute options, instances, and add-ons available; as a result, their differences in cost can run into the hundreds of dollars. All the cloud computing platforms use Databricks Units (DBUs) to meter your runtime. A DBU is a unit of processing capability per hour, billed on a per-second basis.
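To make the per-second DBU metering concrete, here is a minimal Python sketch. The rate and DBU figures are placeholders for illustration, not any vendor's actual prices:

```python
# Hedged sketch of per-second DBU billing (placeholder numbers, not real rates).
dbu_rate = 0.10        # USD per DBU for the chosen plan (placeholder)
dbu_per_hour = 1.0     # DBUs a given instance type emits per hour (placeholder)

def dbu_cost(runtime_seconds):
    """Cost of a workload billed per second of runtime, prorated from the hourly DBU rate."""
    hours = runtime_seconds / 3600
    return hours * dbu_per_hour * dbu_rate

# A 90-minute job is billed for exactly 90 minutes, not rounded up to 2 full hours:
print(round(dbu_cost(90 * 60), 4))   # 0.15
```

The key point is the proration: you pay for 1.5 hours of DBUs, not 2, which matters for short or bursty workloads.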

AWS Databricks Pricing

The super frustrating part about being given rates rather than set monthly costs is that you need to calculate the cost yourself. The bright side is that for most pay-as-you-go services there will be a pricing calculator to help you make pricing estimates.


Estimates are not set in stone and don’t take into account any extra fees. It’s also important to note that when you use cloud applications like Databricks Lakehouse, it's also common to use them with other services from your preferred cloud provider.



Rates for AWS Databricks are based on three plans (Standard, Premium, and Enterprise), which all have different rates for different compute services. The full list of AWS Databricks compute options is below. As discussed previously, Photon compute options refer to using the Databricks next-generation engine.


AWS Compute Options

Jobs Light Compute

Jobs Compute / Jobs Compute Photon

Delta Live Tables / Delta Live Tables Photon

SQL Compute

All-Purpose Compute / All-Purpose Compute Photon

Serverless SQL Compute


Jobs Light Compute is for operating data engineering pipelines so you can manage your data lakes, while Jobs Compute is scaled for more robust workloads. Jobs Light Compute is the equivalent of the open-source Apache Spark runtime.


With SQL Compute you can interact with your data through queries and visualization, and in turn make decisions with that intelligence. The difference with Serverless SQL Compute is that the compute is fully managed and hosted on AWS, so it starts quickly.


With Delta Live Tables, teams can quickly transform data into pipelines through Python or SQL code. It follows an extract, transform, load (ETL) pattern, which transforms the data before loading it. Extract, load, transform (ELT) systems transform the data after loading, which can be a longer process. For more about Delta Live Tables go here.
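To make the ETL/ELT distinction concrete, here is a minimal, hypothetical Python sketch (the sample records and the transform function are illustrative, not part of any Databricks API). The only difference between the two patterns is whether the transform step runs before or after the data lands in the destination:

```python
# Minimal illustration of ETL vs. ELT ordering (hypothetical data, no Databricks API).

raw_records = ['  Alice ', 'BOB', 'carol  ']     # "extracted" source data

def transform(records):
    """Normalize records: strip whitespace, lowercase."""
    return [r.strip().lower() for r in records]

# ETL: transform first, then load the clean data into the destination.
etl_destination = transform(raw_records)

# ELT: load the raw data first, then transform it inside the destination.
elt_destination = list(raw_records)              # raw data lands first
elt_destination = transform(elt_destination)     # transformed afterwards

print(etl_destination)   # ['alice', 'bob', 'carol']
print(elt_destination)   # ['alice', 'bob', 'carol']
```

Both patterns end with the same clean data; ELT simply stores the raw records first and defers the transformation work to the destination system.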


All-Purpose Compute gives data science teams the power to go deeper with data analytics for business intelligence using machine learning. AWS also offers an Enhanced Security and Compliance add-on that is only available on the Enterprise plan; the cost is 15% of your product spend.


We aren’t recreating their entire table or usage rates list because it would be better for you and your team to look through those costs directly. You can find the AWS pricing here and the pricing calculator with a list of AWS instance rates here. There is also a helpful FAQs section on the AWS pricing page that answers questions about billing and support.


One of the best things you can do considering all the cost rates is using the pricing calculator AWS provides on the Databricks website. It’s not the same experience as using the AWS pricing calculator; it's actually simpler with fewer input fields.


The estimate depicted below inputs a runtime of 24 hours a day, 30 days a month, for 2 instances, with all settings on default for the Standard plan. The default compute type is All-Purpose Compute and the default instance is m5d.large.


With that information, it comes to $196 per month, but again this doesn’t factor in extra fees or other AWS services you use. You can also pay for more than one compute option, and estimate the total by adding on more.
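The arithmetic behind that figure is easy to check yourself. A sketch of the calculation, assuming the rates the calculator appeared to use at the time (roughly $0.40 per DBU for Standard All-Purpose Compute and about 0.34 DBU per hour for an m5d.large instance; verify these against the current AWS pricing page before relying on them):

```python
# Hedged sketch of the AWS Databricks monthly estimate.
# Assumed rates (verify against the current AWS pricing page):
dbu_rate = 0.40        # USD per DBU, Standard plan, All-Purpose Compute
dbu_per_hour = 0.34    # DBUs emitted per hour by one m5d.large instance

instances = 2
hours = 24 * 30        # 24 hours a day for 30 days

monthly_cost = instances * hours * dbu_per_hour * dbu_rate
print(f"${monthly_cost:.2f} per month")   # $195.84, i.e. roughly $196
```

Note this covers only the DBU charge; the underlying EC2 instance cost is billed separately by AWS and is not included here.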



Your costs will certainly be different depending on the compute type you want and the instance you need. The image below shows the different instances available and if you input the compute type it will also tell you the different rates for each one (it will not show any rates without knowing the compute type).



The best way to get a closer estimate of your needs may be by contacting sales. You can fill out a contact form here.


Below is a video for how to get started with AWS Databricks.


Deploying Databricks on the AWS Marketplace

Microsoft Azure Databricks Pricing

Microsoft Azure offers different pricing options than the AWS Databricks version. Only the Standard and Premium plans are available, and the compute options do not include Jobs Light Compute.


Part of the reason why Jobs Light Compute isn’t offered is that it's the same as the community edition of Databricks with Apache Spark, but Azure Databricks already works with Apache Spark directly. As discussed previously, Photon compute options refer to using the Databricks next-generation engine.


Azure has two pricing lists: one on the Databricks website, and one on Microsoft's own website. Their own pricing page is far more detailed and discusses an option for pre-paying for the year, so it's well worth going through.


Azure Databricks Compute Options

Jobs Compute / Jobs Compute Photon

Delta Live Tables / Delta Live Tables Photon

SQL Compute

All-Purpose Compute / All-Purpose Compute Photon

Serverless SQL Compute


Something that's important to understand is that Azure Databricks is not the same product as the AWS or GCP versions. You might expect the same product from a different retailer, but it's effectively a different product. In Azure's case, the instances for Databricks are virtual machines (VMs) instead of regular cloud server environments. The combined cost of your Databricks runtime and VMs can get very high.
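Because the VM and the DBU consumption show up as separate line items, a rough Azure monthly estimate has two components. A minimal sketch with hypothetical placeholder rates (not current Azure prices; check the Azure pricing calculator for real figures):

```python
# Hedged sketch: Azure Databricks cost = VM charge + DBU charge.
# All rates below are hypothetical placeholders, not current Azure prices.
vm_rate = 0.30         # USD per VM-hour (placeholder)
dbu_rate = 0.55        # USD per DBU (placeholder)
dbu_per_hour = 0.75    # DBUs per hour for the chosen VM size (placeholder)

vms = 2
hours = 730            # roughly one month, as the Azure estimator assumes

vm_cost = vms * hours * vm_rate
dbu_cost = vms * hours * dbu_per_hour * dbu_rate
total = vm_cost + dbu_cost
print(f"VM: ${vm_cost:.2f}  DBU: ${dbu_cost:.2f}  Total: ${total:.2f}")
```

The point of the sketch is the structure, not the numbers: doubling your VM count doubles both components, while picking a larger VM size raises both the VM rate and the DBUs emitted per hour.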


In terms of a pricing calculator, Microsoft Azure only has their main one and does not offer a specialized version on the Databricks website. This means you will need to visit their calculator, look up Databricks, and input your usage information from there. You can get started here.


When you do begin your estimate for Azure Databricks you will find it is already pre-populated and set to the Premium plan. For the estimate depicted below, we chose Standard but kept all other defaults for compute and instances. The estimator is already set to 730 hours or one month of usage. The total, without adding fees, taxes, or other Azure applications you use, is $422.67 a month.



If you want a more tailored quote you can contact them for better pricing here. Below is a tutorial that can show what using Azure Databricks is like.


Azure Databricks Tutorial | Data transformations at scale

GCP Databricks Pricing

Similar to Azure Databricks, Google Cloud Databricks only has Standard and Premium plans. Unlike the other cloud services, GCP's version includes a compute option called DLT Advanced Compute / DLT Advanced Compute Photon, an upgraded version of Delta Live Tables. As a reminder, Photon compute options refer to using the Databricks next-generation engine.


GCP Databricks Compute Options

Jobs Compute / Jobs Compute Photon

SQL Compute

DLT Advanced Compute / DLT Advanced Compute Photon

All-Purpose Compute / All-Purpose Compute Photon


The security option GCP has is the HIPAA Compliance add-on for Premium users, which costs 10% of product spend.


You can find GCP's pricing page on Databricks here and their pricing calculator on Databricks here. The cost difference between GCP and the other cloud services comes down to the different instances that GCP offers. You will find that GCP's instances for All-Purpose Compute cost more than AWS instances. GCP also tends to give far more GB of memory per instance, which may explain the jump in cost.


GCP's pricing calculator is available on the Databricks website and is simplified like the AWS version. The setting starts at Premium, but for the estimate depicted below it was brought down to Standard with all other defaults kept the same. The runtime was entered as 24 hours a day for 30 days. The total for each month comes to $408.96.



For more insight and transparency on the costs see the contact page for pricing here. Below is a video to help you get started with GCP Databricks.


Getting Started with Databricks on Google Cloud Platform

What is an Alternative to Databricks?

One great alternative to Databricks is Apache Spark. Besides underpinning the Databricks Community Edition, Apache Spark is a powerful engine for enterprise-level data analytics in its own right. It is a free, ready-to-download resource. Teams can use it for running SQL queries, real-time processing, and developing machine learning algorithms.


In comparison to the paid versions of Databricks, Apache Spark offers on-premises control over your data. It also integrates with the most popular cloud vendors and other powerful software, including Databricks.


To see a comparison of the two platforms with end-user-provided feedback you can go here. The similarity users love is the ability to handle large amounts of data, with querying options to access it easily.

More Resources

If you want to see more software platforms, you're in luck. We have several software categories related to Databricks. Some categories that may apply to you include:


Data lakehouse software

Machine learning software

Data science software

Data preparation software


For those who have used any of the platforms discussed here, please leave a review to help other buyers make informed decisions.