127 Ratings
trScore algorithm: learn more at https://www.trustradius.com/static/about-trustradius-scoring
Score 8.7 out of 10
24 Ratings
Score 8.5 out of 10

Highlights

Apache Spark and Databricks Unified Analytics Platform are ‘big data’ processing and analytics tools. Apache Spark is an open-source, general-purpose data processing engine. Databricks Unified Analytics Platform, by contrast, is a paid analytics and data processing platform built on Apache Spark that adds support, services, and features.

Both Apache Spark and Databricks Unified Analytics Platform are primarily used by large enterprises, with a significant user base among mid-sized companies as well. Both tools focus on big data processing, often making them overkill for the needs of smaller businesses.

Features

Apache Spark is a core component of Databricks Unified Analytics Platform, which means that it’s difficult to compare them directly. Essentially, an organization would not be able to use Databricks Unified Analytics Platform without also using Apache Spark. In this section, we’ll examine the advantages of Apache Spark as a general data processing engine, then discuss the benefits of Databricks Unified Analytics Platform as a platform.

Apache Spark is designed to be a lightning-fast data processing engine with multiple use cases. Its in-memory processing design means it can run with very few disk read/write operations, which helps it run quickly even on enormous datasets. Developers report that its SQL interface and object-oriented design make it intuitive to understand and write code for. Users also appreciate its rich set of APIs for cluster management and ETL procedures. As an open-source tool with wide industry adoption, Apache Spark has a large support community and plenty of recommended solutions to common problems—and, of course, it’s free.
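
To make the in-memory and SQL points concrete, here is a minimal PySpark sketch. It is an illustration only, not something taken from the vendor or the reviewers. It assumes PySpark and a Java runtime are installed locally; the sample data, the `orders` view name, and the helper names `make_local_session` and `totals_by_customer` are all hypothetical. The sketch marks a small DataFrame for in-memory caching, then computes the same aggregate once through the DataFrame API and once through SQL.

```python
# Hypothetical sample data: (customer, amount) pairs.
SAMPLE_ORDERS = [
    ("alice", 30.0),
    ("bob", 12.5),
    ("alice", 20.0),
    ("carol", 7.5),
]

# The same aggregate expressed declaratively; "orders" is an assumed view name.
TOTALS_SQL = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
"""


def make_local_session(app_name="spark-sketch"):
    """Start a single-threaded local Spark session (needs PySpark and Java)."""
    # Imported here so the module still loads where PySpark is not installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.master("local[1]").appName(app_name).getOrCreate()


def totals_by_customer(spark, orders):
    """Return per-customer totals computed via the DataFrame API and via SQL."""
    df = spark.createDataFrame(orders, ["customer", "amount"])

    # cache() only marks the DataFrame for in-memory storage; the data is
    # materialized by the first action and reused by later ones.
    df.cache()
    df.createOrReplaceTempView("orders")

    # DataFrame API: group and sum programmatically.
    api_rows = df.groupBy("customer").sum("amount").collect()
    # SQL interface: the same question asked of the same cached data.
    sql_rows = spark.sql(TOTALS_SQL).collect()

    api_totals = {row[0]: row[1] for row in api_rows}
    sql_totals = {row[0]: row[1] for row in sql_rows}

    # Release the cached data and the temporary view.
    df.unpersist()
    spark.catalog.dropTempView("orders")
    return api_totals, sql_totals
```

Calling `totals_by_customer(make_local_session(), SAMPLE_ORDERS)` returns two dictionaries that should match, which is the SQL and Python interoperability reviewers mention. On a real cluster the session would be configured by the cluster manager rather than created with `local[1]`.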

If Apache Spark is the engine, Databricks Unified Analytics Platform is the whole car: a full-service data analytics solution with collaboration features, machine learning tools, a data lake, and data pipeline capabilities. The service simplifies and streamlines the setup and maintenance of Apache Spark clusters, adding data security and automatic cluster management features. It supports multiple languages, such as Scala and Python, so developers can create data pipelines in a language they're comfortable with. It also adds integrations for applications and services such as Microsoft Azure or AWS. Dedicated customer support teams assist clients with custom features or exceptions, tailoring the platform to their needs.

Limitations

Consider the limitations of Apache Spark and Databricks Unified Analytics Platform before adopting one or both of them. 

As a standalone tool, Apache Spark requires supporting tools to fill in capability gaps. For example, users will need to provide a database infrastructure to store the information Apache Spark works with, which requires separate expertise and development. Apache Spark’s in-memory processing may be fast, but it also means high memory requirements, which can get expensive very quickly. Some users found that the tool isn’t well-suited for real-time analytics, while others wished for more integrated data security features. Finally, Apache Spark may be designed intuitively, but it’s still a complicated piece of software with a significant learning curve. And since it’s open-source, there’s no dedicated training or customer support.

Databricks Unified Analytics Platform offers additional features and services. However, for businesses with smaller datasets or more focused processing needs, the full-service platform may be more than they need. Although its UI is intuitive, some Databricks Unified Analytics Platform users suffered from long loading times or problems with language interpreter settings. Other users found the platform’s in-house documentation insufficient and ended up resorting to outside sources for support. Databricks Unified Analytics Platform also isn’t free; businesses will have to pay for the amount of processing they need.

Pricing

Apache Spark is open-source and free to download.

Databricks Unified Analytics Platform offers tiered pricing based on per-second usage. Pricing varies depending on the service tier and cloud platform (Azure or AWS) used. More pricing information is available on the vendor’s website. 
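
As a rough illustration of how per-second, tiered billing adds up, the sketch below converts a usage duration into a cost. Everything specific in it is an assumption: the tier names, the hourly rates, and the `estimate_cost` helper are placeholders invented for the example and are not vendor prices or a vendor API. Actual rates should be taken from the vendor's website.

```python
from decimal import Decimal

SECONDS_PER_HOUR = 3600

# Placeholder hourly rates keyed by (tier, cloud).
# These numbers are invented for illustration and are NOT real vendor prices.
PLACEHOLDER_HOURLY_RATES = {
    ("standard", "aws"): Decimal("1.00"),
    ("standard", "azure"): Decimal("1.10"),
    ("premium", "aws"): Decimal("2.00"),
    ("premium", "azure"): Decimal("2.20"),
}


def estimate_cost(tier, cloud, seconds_used):
    """Estimate the cost of `seconds_used` seconds at a placeholder hourly rate."""
    if seconds_used < 0:
        raise ValueError("seconds_used must be non-negative")
    hourly_rate = PLACEHOLDER_HOURLY_RATES[(tier, cloud)]
    # Bill by the second: prorate the hourly rate over the seconds actually used.
    cost = hourly_rate * seconds_used / SECONDS_PER_HOUR
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))
```

With these placeholder rates, a 90-minute job (5,400 seconds) on the hypothetical standard AWS tier comes to `estimate_cost("standard", "aws", 5400)`, that is, 1.5 hours at 1.00 per hour. The point of per-second billing is that a short job is charged only for the time it runs rather than for a full hour.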

Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and trying to make sense of very, very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are only partially used because they take too long to run once the data sets get very large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young | TrustRadius Reviewer

Databricks Unified Analytics Platform

Databricks has helped my teams write PySpark and Spark SQL jobs and test them out before formally integrating them in Spark jobs. Through Databricks we can create parquet and JSON output files. Datamodelers and scientists who are not very good with coding can get good insight into the data using the notebooks that can be developed by the engineers.
Anonymous | TrustRadius Reviewer

Feature Rating Comparison

Platform Connectivity

  • Overall: Apache Spark no score; Databricks Unified Analytics Platform 8.3
  • Connect to Multiple Data Sources: Apache Spark no score; Databricks Unified Analytics Platform 9.0
  • Extend Existing Data Sources: Apache Spark no score; Databricks Unified Analytics Platform 9.0
  • Automatic Data Format Detection: Apache Spark no score; Databricks Unified Analytics Platform 7.0

Data Exploration

  • Overall: Apache Spark no score; Databricks Unified Analytics Platform 6.0
  • Visualization: Apache Spark no score; Databricks Unified Analytics Platform 6.0
  • Interactive Data Analysis: Apache Spark no score; Databricks Unified Analytics Platform 6.0

Data Preparation

  • Overall: Apache Spark no score; Databricks Unified Analytics Platform 8.0
  • Interactive Data Cleaning and Enrichment: Apache Spark no score; Databricks Unified Analytics Platform 8.0
  • Data Transformations: Apache Spark no score; Databricks Unified Analytics Platform 9.0
  • Data Encryption: Apache Spark no score; Databricks Unified Analytics Platform 7.0
  • Built-in Processors: Apache Spark no score; Databricks Unified Analytics Platform 8.0

Platform Data Modeling

  • Overall: Apache Spark no score; Databricks Unified Analytics Platform 8.3
  • Multiple Model Development Languages and Tools: Apache Spark no score; Databricks Unified Analytics Platform 9.0
  • Automated Machine Learning: Apache Spark no score; Databricks Unified Analytics Platform 8.0
  • Single platform for multiple model development: Apache Spark no score; Databricks Unified Analytics Platform 9.0
  • Self-Service Model Delivery: Apache Spark no score; Databricks Unified Analytics Platform 7.0

Model Deployment

  • Overall: Apache Spark no score; Databricks Unified Analytics Platform 7.5
  • Flexible Model Publishing Options: Apache Spark no score; Databricks Unified Analytics Platform 7.0
  • Security, Governance, and Cost Controls: Apache Spark no score; Databricks Unified Analytics Platform 8.0

Pros

Apache Spark

  • Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between SQL and Scala / Python style of munging data
Nitin Pasumarthy | TrustRadius Reviewer

Databricks Unified Analytics Platform

  • Extremely Flexible in Data Scenarios
  • Fantastic Performance
  • Databricks is always updating the system, so we can have the latest features.
Anonymous | TrustRadius Reviewer

Cons

Apache Spark

  • Memory management. Very weak on that.
  • PySpark is not as robust as Scala with Spark.
  • Spark master HA is needed. Not as HA as it should be.
  • Locality should not be a necessity, although it does help. I would prefer no locality requirement.
Anson Abraham | TrustRadius Reviewer

Databricks Unified Analytics Platform

  • The navigation for creating a workspace is a bit confusing at first. It takes a couple of minutes to figure out how to create a folder and upload files, since it does not work the same way as traditional file systems such as box.com
  • Also, when you create a table, if you forget to copy the link where the table is stored, it is hard to locate it again. Most of the time I would have to delete the table and re-create it.
Ann Le | TrustRadius Reviewer

Usability

Apache Spark

Apache Spark 8.7
Based on 3 answers
Apache Spark integrates with multiple big data frameworks. It does not exert too much load on the disks. Moreover, it is easy to program and use. Its high-level APIs reduce the headache of using different applications separately. Big data processing has never been as easy as it is with Apache Spark.
Partha Protim Pegu | TrustRadius Reviewer

Databricks Unified Analytics Platform

Databricks Unified Analytics Platform 9.0
Based on 1 answer
This has been very useful in my organization for shared notebooks, integrated data pipeline automation, and data source integrations. Integration with AWS is seamless. Non-technical users can easily learn how to use Databricks. You can connect your company LDAP to it for login-based access controls, to some extent.
Anonymous | TrustRadius Reviewer

Support Rating

Apache Spark

Apache Spark 8.2
Based on 5 answers
  1. It integrates very well with Scala or Python.
  2. It's very easy to understand SQL interoperability.
  3. Apache Spark is way faster than other competing technologies.
  4. The support from the Apache community is very large for Spark.
  5. Execution times are faster as compared to others.
  6. There are a large number of forums available for Apache Spark.
  7. The code for Apache Spark is simple and easy to gain access to.
  8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Yogesh Mhasde | TrustRadius Reviewer

Databricks Unified Analytics Platform

No score
No answers yet
No answers on this topic

Alternatives Considered

Apache Spark

Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
Anonymous | TrustRadius Reviewer

Databricks Unified Analytics Platform

Easier to set up and get started. Less of a learning curve.
Anonymous | TrustRadius Reviewer

Return on Investment

Apache Spark

  • It has had a very positive impact, as it helps reduce the data processing time and thus helps us achieve our goals much faster.
  • Being easy to use, it allows us to adapt to the tool much faster than with others, which in turn allows us to access various data sources such as Hadoop, Apache Mesos, Kubernetes, independently or in the cloud. This makes it very useful.
  • It was very easy for me to use Apache Spark and learn it since I come from a background of Java and SQL, and it shares those basic principles and uses a very similar logic.
Carla Borges | TrustRadius Reviewer

Databricks Unified Analytics Platform

  • Rapid growth of analytics within our company.
  • Cost model aligns with usage allowing us to make a reasonable initial investment and scale the cost as we realize the value.
  • Platform is easy to learn and Databricks provides excellent support and training.
  • Platform does not require a large DevOps investment.
Anonymous | TrustRadius Reviewer

Pricing Details

Apache Spark

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No

Databricks Unified Analytics Platform

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No

Rating Summary

Likelihood to Recommend

Apache Spark
8.5
Databricks Unified Analytics Platform
8.9

Usability

Apache Spark
8.7
Databricks Unified Analytics Platform
9.0

Support Rating

Apache Spark
8.2
Databricks Unified Analytics Platform
No score