109 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow'>trScore algorithm: Learn more.</a>
Score 8.4 out of 10
16 Ratings
Score 8.7 out of 10

Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and making sense of very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because they take too much time to run once the data sets get very large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young

Databricks Unified Analytics Platform

Databricks has helped my teams write PySpark and Spark SQL jobs and test them out before formally integrating them into Spark jobs. Through Databricks we can create Parquet and JSON output files. Data modelers and scientists who are not strong coders can get good insight into the data using notebooks developed by the engineers.

Feature Rating Comparison

Platform Connectivity

Apache Spark
Databricks Unified Analytics Platform
8.3
Connect to Multiple Data Sources
Apache Spark
Databricks Unified Analytics Platform
9.0
Extend Existing Data Sources
Apache Spark
Databricks Unified Analytics Platform
9.0
Automatic Data Format Detection
Apache Spark
Databricks Unified Analytics Platform
7.0

Data Exploration

Apache Spark
Databricks Unified Analytics Platform
6.0
Visualization
Apache Spark
Databricks Unified Analytics Platform
6.0
Interactive Data Analysis
Apache Spark
Databricks Unified Analytics Platform
6.0

Data Preparation

Apache Spark
Databricks Unified Analytics Platform
8.0
Interactive Data Cleaning and Enrichment
Apache Spark
Databricks Unified Analytics Platform
8.0
Data Transformations
Apache Spark
Databricks Unified Analytics Platform
9.0
Data Encryption
Apache Spark
Databricks Unified Analytics Platform
7.0
Built-in Processors
Apache Spark
Databricks Unified Analytics Platform
8.0

Platform Data Modeling

Apache Spark
Databricks Unified Analytics Platform
8.3
Multiple Model Development Languages and Tools
Apache Spark
Databricks Unified Analytics Platform
9.0
Automated Machine Learning
Apache Spark
Databricks Unified Analytics Platform
8.0
Single platform for multiple model development
Apache Spark
Databricks Unified Analytics Platform
9.0
Self-Service Model Delivery
Apache Spark
Databricks Unified Analytics Platform
7.0

Model Deployment

Apache Spark
Databricks Unified Analytics Platform
7.5
Flexible Model Publishing Options
Apache Spark
Databricks Unified Analytics Platform
7.0
Security, Governance, and Cost Controls
Apache Spark
Databricks Unified Analytics Platform
8.0

Pros

Apache Spark

  • Rich APIs for data transformation make it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between the SQL and Scala / Python styles of munging data
Nitin Pasumarthy

Databricks Unified Analytics Platform

  • Process raw data in the One Lake (S3) environment into relational tables and views
  • Share notebooks with our business analysts so that they can use the queries and generate value out of the data
  • Try out PySpark and Spark SQL queries on raw data before using them in our Spark jobs
  • Modern-day ETL operations are made easy with Databricks. It provides access mechanisms for different sets of customers

Cons

Apache Spark

  • Documentation could be better; I usually end up going to other sites and blogs to understand the concepts
  • More algorithms need to be ported to MLlib; only a few are available, at least for clustering
Nitin Pasumarthy

Databricks Unified Analytics Platform

  • The navigation for creating a workspace is a bit confusing at first. It takes a couple of minutes to figure out how to create a folder and upload files, since it does not work the same way as traditional file systems such as box.com
  • Also, when you create a table, if you forget to copy the link where the table is stored, it is hard to locate it again. Most of the time I have to delete the table and re-create it.
Ann Le

Usability

Apache Spark

No score
No answers yet
No answers on this topic

Databricks Unified Analytics Platform

Databricks Unified Analytics Platform 9.0
Based on 1 answer
This has been very useful in my organization for shared notebooks, integrated data pipeline automation, and data source integrations. Integration with AWS is seamless. Non-technical users can easily learn how to use Databricks. You can connect your company LDAP to it for login-based access controls, to some extent.

Alternatives Considered

Apache Spark

Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.

Databricks Unified Analytics Platform

I also use Microsoft Azure Machine Learning in parallel with Databricks. They use different file formats, which teaches me to be flexible and able to write different programs. They are equally useful to me, and I would like to master both platforms for any future usage. I do prefer Databricks because it could be free if I decided to go with the Databricks Community Edition only.
Ann Le

Return on Investment

Apache Spark

  • Apache Spark has faster performance compared to MapReduce.
  • The combination of Python and Spark is the best: shorter code and faster, more efficient performance.
  • Can replace an RDBMS
Kartik Chavan

Databricks Unified Analytics Platform

  • ROI for us has been tremendous. Time to market by processing raw data in our big data infrastructure has been pretty fast.
  • Non-engineers can easily use Databricks, which helps business customers.
  • Thousands of different data combinations can easily be joined and used by our data teams.

Pricing Details

Apache Spark

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details

Databricks Unified Analytics Platform

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details
