43 Ratings
Score 8.8 out of 10
24 Ratings
Score 8.5 out of 10

Highlights

Databricks and Amazon EMR (Elastic MapReduce) are solutions for processing big data workloads. Both tend to be deployed at larger enterprises. Databricks handles data ingestion, data pipeline engineering, and ML/data science with its collaborative workbook for writing in R, Python, etc. Amazon EMR lets users rely on multiple open-source tools, such as Apache Spark, Apache Hive, HBase, or Presto, to integrate and process big data workloads more simply.

Features

Databricks and Amazon EMR boast distinct advantages for processing big data workloads. 

Amazon EMR/Elastic MapReduce is described as ideal when managing big data housed in multiple open-source tools such as Apache Hadoop or Spark. Users state that relative to other big data processing tools it is simple to use, and AWS pricing is very simple and appealing compared to competitors. It is secure, scalable, and highly available for a cloud service.

Databricks is praised for its core competencies: its data science notebook is considered better than alternatives (e.g. Jupyter Notebook) for enabling flexible and fast analysis on massive amounts of data while swapping between work in SQL, R, Scala, and Python. Its open-source community documentation, available to all, is well regarded. And because the Databricks Community Edition is free and open-source, it is one of the relatively few options that presents a lower-cost solution than Amazon EMR, though only for the right users and use cases.

Limitations

Users remark on similar limitations when considering Databricks and Amazon EMR for big data.

Amazon EMR is not a fast processor and shines primarily where users need a simplified framework for managing data from multiple tools. Also, particularly when compared to Databricks, the Amazon workbook and its machine learning capabilities are not as mature.

The licensed edition of Databricks is costly, as is its certification cost. Additionally, Databricks can be hard to use for non-technical users, who say its in-app help is unclear and hard to use. And a few say Databricks lacks good visualizations for displaying work.

Pricing

Databricks is available open-source and free via its community edition, or through its Enterprise Cloud editions, on Azure or AWS. Pricing can be complex.

Azure Databricks “Databricks Units” (DBUs) are priced by workload type (Data Engineering, Data Engineering Light, or Data Analytics) and service tier (Standard vs. Premium). Premium adds authentication, access features, and audit logs. The Data Analytics workload is $.40 per DBU hour ($.55 Premium tier) and includes data prep and the data science notebook. The Data Engineering workload includes data pipeline and workload processing, for $.15 per DBU hour ($.30 Premium tier). Data Engineering Light is $.07 per DBU hour ($.22 Premium tier) and only allows users to run jobs.
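
As a rough illustration of how the DBU rates above combine with usage, the following Python sketch looks up the quoted per-DBU-hour rate for a workload and tier and multiplies it by the DBU hours consumed. The `AZURE_DBU_RATES` table and the `dbu_charge` helper are hypothetical names used only for illustration; the rates are those quoted in this comparison and may be out of date, and the sketch covers the DBU charge alone.

```python
# Per-DBU-hour rates in USD, copied from the figures quoted in this comparison.
# (Hypothetical lookup table for illustration; not an official price list.)
AZURE_DBU_RATES = {
    ("Data Analytics", "Standard"): 0.40,
    ("Data Analytics", "Premium"): 0.55,
    ("Data Engineering", "Standard"): 0.15,
    ("Data Engineering", "Premium"): 0.30,
    ("Data Engineering Light", "Standard"): 0.07,
    ("Data Engineering Light", "Premium"): 0.22,
}


def dbu_charge(workload: str, tier: str, dbu_hours: float) -> float:
    """Return the DBU charge in USD for a workload, tier, and DBU-hour count."""
    if dbu_hours < 0:
        raise ValueError("dbu_hours must be non-negative")
    rate = AZURE_DBU_RATES[(workload, tier)]  # KeyError if the pair is unknown
    return round(rate * dbu_hours, 2)
```

For example, 100 DBU hours of Data Engineering on the Premium tier would come to 100 × $.30 = $30 in DBU charges under these rates.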

Databricks on AWS is also priced based on service tier (Standard, Premium, Enterprise) and workload type. Higher service tiers add Optimized Autoscaling, role-based access, federated IAM, HIPAA-compliant storage, access lists for audit, and customer-managed keys. The Jobs Compute workload allows users to run data engineering pipelines and manage and clean data lakes (priced at $.07, $.10, and $.13 by service tier). The All-Purpose Compute service ($.40, $.55, and $.65) is fully featured.

Amazon EMR is available from AWS and is billed per second, with a one-minute minimum. Its hourly rate depends on instance type (e.g. standard, high CPU, high memory, high storage), with present prices ranging from $0.011/hour to $0.27/hour. Amazon EMR is also available as an add-on service for Amazon EC2, and is available reserved, on-demand, or on lower-cost Spot Instances (i.e. AWS’s discounted service using EC2’s unused capacity). Pricing still falls within the range of $0.011 to $0.27 per hour.
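
The per-second billing with a one-minute minimum described above can be sketched in Python as follows. The function name `emr_charge` is hypothetical, the hourly rate is taken as an input, and the sketch assumes the minimum simply rounds any shorter run up to 60 seconds; it illustrates the arithmetic only and is not AWS's billing implementation.

```python
SECONDS_PER_HOUR = 3600
MINIMUM_BILLABLE_SECONDS = 60  # the one-minute minimum mentioned above


def emr_charge(hourly_rate_usd: float, seconds_used: int) -> float:
    """Return the EMR charge in USD for one instance that ran for seconds_used.

    Hypothetical helper: bills per second, but never fewer than 60 seconds.
    """
    if seconds_used < 0:
        raise ValueError("seconds_used must be non-negative")
    billable_seconds = max(seconds_used, MINIMUM_BILLABLE_SECONDS)
    return hourly_rate_usd * billable_seconds / SECONDS_PER_HOUR
```

At the top quoted rate of $0.27/hour, a 10-second run is billed as 60 seconds, i.e. 0.27 × 60 / 3600 = $0.0045, while a full hour is billed at $0.27.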

Likelihood to Recommend

Amazon EMR

Amazon Elastic MapReduce is useful in cases where two conditions are met. First, that you are planning on using multiple big data tools simultaneously to analyze big data sets. And second, that you need a tool that simplifies managing big data tools. If these two conditions are met, MapReduce does a great job. The user interface is simple. The program eliminates some programming requirements. The software also makes setting up big data analyses much easier. With these benefits acknowledged, MapReduce is not a good tool for "small" data analyses, given that there are other tools that do the job quicker and with much more professional output. If you're on the fence, try out MapReduce alongside competing "small" data tools and see if you really need big data software.
Thomas Young | TrustRadius Reviewer

Databricks Unified Analytics Platform

Databricks has helped my teams write PySpark and Spark SQL jobs and test them out before formally integrating them in Spark jobs. Through Databricks we can create Parquet and JSON output files. Data modelers and scientists who are not very good with coding can get good insight into the data using the notebooks that can be developed by the engineers.
Anonymous | TrustRadius Reviewer

Feature Rating Comparison

Platform Connectivity

Amazon EMR: No score
Databricks Unified Analytics Platform: 8.3

  • Connect to Multiple Data Sources: Amazon EMR, no score; Databricks Unified Analytics Platform, 9.0
  • Extend Existing Data Sources: Amazon EMR, no score; Databricks Unified Analytics Platform, 9.0
  • Automatic Data Format Detection: Amazon EMR, no score; Databricks Unified Analytics Platform, 7.0

Data Exploration

Amazon EMR: No score
Databricks Unified Analytics Platform: 6.0

  • Visualization: Amazon EMR, no score; Databricks Unified Analytics Platform, 6.0
  • Interactive Data Analysis: Amazon EMR, no score; Databricks Unified Analytics Platform, 6.0

Data Preparation

Amazon EMR: No score
Databricks Unified Analytics Platform: 8.0

  • Interactive Data Cleaning and Enrichment: Amazon EMR, no score; Databricks Unified Analytics Platform, 8.0
  • Data Transformations: Amazon EMR, no score; Databricks Unified Analytics Platform, 9.0
  • Data Encryption: Amazon EMR, no score; Databricks Unified Analytics Platform, 7.0
  • Built-in Processors: Amazon EMR, no score; Databricks Unified Analytics Platform, 8.0

Platform Data Modeling

Amazon EMR: No score
Databricks Unified Analytics Platform: 8.3

  • Multiple Model Development Languages and Tools: Amazon EMR, no score; Databricks Unified Analytics Platform, 9.0
  • Automated Machine Learning: Amazon EMR, no score; Databricks Unified Analytics Platform, 8.0
  • Single platform for multiple model development: Amazon EMR, no score; Databricks Unified Analytics Platform, 9.0
  • Self-Service Model Delivery: Amazon EMR, no score; Databricks Unified Analytics Platform, 7.0

Model Deployment

Amazon EMR: No score
Databricks Unified Analytics Platform: 7.5

  • Flexible Model Publishing Options: Amazon EMR, no score; Databricks Unified Analytics Platform, 7.0
  • Security, Governance, and Cost Controls: Amazon EMR, no score; Databricks Unified Analytics Platform, 8.0

Pros

Amazon EMR

  • Easier to implement than older on-premise solutions
  • Works with open source technologies.
  • Keeps processing cost low.
  • It is flexible and works also for short term workloads and the pricing changes to that model.
Nicolas Costa Ossa | TrustRadius Reviewer

Databricks Unified Analytics Platform

  • Extremely Flexible in Data Scenarios
  • Fantastic Performance
  • DB is always updating the system so we can have latest features.
Anonymous | TrustRadius Reviewer

Cons

Amazon EMR

  • Its machine learning capabilities could be more mature.
  • The support material available online on Elastic MapReduce is limited and we might end up spending more time in understanding/researching the tool.
Anonymous | TrustRadius Reviewer

Databricks Unified Analytics Platform

  • The navigation through which one would create a workspace is a bit confusing at first. It takes a couple of minutes to figure out how to create a folder and upload files, since it is not the same as traditional file systems such as box.com.
  • Also, when you create a table, if you forget to copy the link where the table is stored, it is hard to relocate it. Most of the time I would have to delete the table and re-create it.
Ann Le | TrustRadius Reviewer

Usability

Amazon EMR

Amazon EMR 8.3
Based on 4 answers
I give Amazon EMR this rating because, while it is great at simplifying running big data frameworks, providing the Amazon EMR highlights, product details, and pricing information, and analyzing vast amounts of data, it can run slowly, freeze, and glitch sometimes. So overall Amazon EMR is pretty good to use other than some basic issues.
Anonymous | TrustRadius Reviewer

Databricks Unified Analytics Platform

Databricks Unified Analytics Platform 9.0
Based on 1 answer
This has been very useful in my organization for shared notebooks, integrated data pipeline automation, and data source integrations. Integration with AWS is seamless. Non-technical users can easily learn how to use Databricks. You can have your company LDAP connect to it for login-based access controls to some extent.
Anonymous | TrustRadius Reviewer

Support Rating

Amazon EMR

Amazon EMR 9.3
Based on 4 answers
AWS and EMR support are on par with the best out there. You pay a premium for the support but they can save you time and money by quickly resolving issues or helping you get your problem taken care of. They are competing with Google and MS, and it shows in their support.
Anonymous | TrustRadius Reviewer

Databricks Unified Analytics Platform

No score
No answers yet
No answers on this topic

Alternatives Considered

Amazon EMR

The alternatives to EMR are mainly Hadoop distributions owned by the 3 companies above. I have not used the other distributions, so it is difficult to comment, but the general tradeoff is that, at the cost of a longer setup time and more infra management, you get more flexible versioning and potentially faster access to newer versions of some frameworks such as Spark.
Anonymous | TrustRadius Reviewer

Databricks Unified Analytics Platform

Easier to set up and get started. Less of a learning curve.
Anonymous | TrustRadius Reviewer

Return on Investment

Amazon EMR

  • It was obviously cheaper and more convenient to use, as most of our data processing and pipelines are on AWS. It was fast and readily available with a click, and that saved a ton of time compared with having to figure out the downtime of the cluster if it's on premises.
  • It saved time on processing chunks of big data which had to be processed in short period with minimal costs. EMR solved this as the cluster setup time and processing was simple, easy, cheap and fast.
  • It had a negative impact, as it was very difficult to submit test jobs because it lacks a UI for submitting Spark code snippets.
Anonymous | TrustRadius Reviewer

Databricks Unified Analytics Platform

  • Rapid growth of analytics within our company.
  • Cost model aligns with usage allowing us to make a reasonable initial investment and scale the cost as we realize the value.
  • Platform is easy to learn and Databricks provides excellent support and training.
  • Platform does not require a large DevOps investment.
Anonymous | TrustRadius Reviewer

Pricing Details

Amazon EMR

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No

Databricks Unified Analytics Platform

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No

Rating Summary

Likelihood to Recommend

Amazon EMR: 8.5
Databricks Unified Analytics Platform: 8.9

Usability

Amazon EMR: 8.3
Databricks Unified Analytics Platform: 9.0

Support Rating

Amazon EMR: 9.3
Databricks Unified Analytics Platform: No score