What users are saying about
111 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow noopener noreferrer'>trScore algorithm: Learn more.</a>
Score 8.5 out of 10
13 Ratings
Score 7.5 out of 10

Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and trying to make sense of very, very large data sets. The software offers many advanced machine learning and econometrics tools, although in practice these tools are only partially used because they take too long to run on very large data sets. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young

Data Science Workbench

  • If you already have a Cloudera partnership and a cluster, having this is a no-brainer.
  • It integrates well with your existing ecosystem and lets you start working on projects immediately, accessing full datasets and sharing analysis and results.
  • With the inclusion of Kubernetes, CPU and memory across worker nodes can be managed effectively.
Bharadwaj (Brad) Chivukula

Feature Rating Comparison

Platform Connectivity

Apache Spark / Data Science Workbench: 7.0
  • Connect to Multiple Data Sources: 6.0
  • Extend Existing Data Sources: 7.0
  • Automatic Data Format Detection: 7.0
  • MDM Integration: 8.0

Data Exploration

Apache Spark / Data Science Workbench: 9.0
  • Visualization: 9.0
  • Interactive Data Analysis: 9.0

Data Preparation

Apache Spark / Data Science Workbench: 7.8
  • Interactive Data Cleaning and Enrichment: 8.0
  • Data Transformations: 8.0
  • Data Encryption: 8.0
  • Built-in Processors: 7.0

Platform Data Modeling

Apache Spark / Data Science Workbench: 9.7
  • Multiple Model Development Languages and Tools: 9.0
  • Single platform for multiple model development: 10.0
  • Self-Service Model Delivery: 10.0

Model Deployment

Apache Spark / Data Science Workbench: 7.0
  • Flexible Model Publishing Options: 10.0
  • Security, Governance, and Cost Controls: 4.0
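
The category scores above look consistent with a simple average of the feature scores listed under them (for example, 6.0, 7.0, 7.0, and 8.0 average to 7.0 for Platform Connectivity). The short Python sketch below checks that arithmetic. This is only an observation drawn from the numbers on this page, not a documented scoring formula, and the names `ratings`, `category_mean`, and `is_consistent` are hypothetical.

```python
from statistics import mean

# Scores copied from the Feature Rating Comparison section:
# category name -> (reported category score, feature scores listed under it).
ratings = {
    "Platform Connectivity": (7.0, [6.0, 7.0, 7.0, 8.0]),
    "Data Exploration": (9.0, [9.0, 9.0]),
    "Data Preparation": (7.8, [8.0, 8.0, 8.0, 7.0]),
    "Platform Data Modeling": (9.7, [9.0, 10.0, 10.0]),
    "Model Deployment": (7.0, [10.0, 4.0]),
}


def category_mean(features):
    """Plain arithmetic mean of the feature scores (an assumption, not a documented formula)."""
    return mean(features)


def is_consistent(reported, features, tolerance=0.05):
    """True if the reported score agrees with the mean once rounded to one decimal place."""
    # The tiny extra margin absorbs floating-point noise at the rounding boundary.
    return abs(category_mean(features) - reported) <= tolerance + 1e-9


if __name__ == "__main__":
    for name, (reported, features) in ratings.items():
        print(f"{name}: reported {reported}, mean {category_mean(features):.2f}")
```

A low category score can therefore hide a wide spread: Model Deployment averages a 10.0 and a 4.0, so the individual feature scores are worth reading alongside the headline number.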

Pros

Apache Spark

  • Ease of use: the Spark API allows for minimal boilerplate and can be used from a variety of languages, including Python, Scala, and Java.
  • Performance: for most applications, we have found that jobs run faster on Spark than on other distributed processing technologies like MapReduce, Hive, and Pig.
  • Flexibility: the framework comes with support for streaming, batch processing, SQL queries, machine learning, etc. It can be used in a variety of applications without needing to integrate a lot of other distributed processing technologies.

Data Science Workbench

  • One single IDE (a browser-based application) that integrates Scala, R, and Python under one tool
  • For larger organizations/teams, it lets you be self-reliant
  • As it sits on your cluster, it has very easy access to all the data on HDFS
  • Linking with GitHub is a very good way to keep the code versions intact
Bharadwaj (Brad) Chivukula

Cons

Apache Spark

  • PySpark does not yet offer the same ease of use and functionality as Pandas.

Data Science Workbench

  • Not as great as RStudio; lacks some features when compared with it
  • It is still quite simple (because it's at a very early stage), and companies may want to wait until they see a more developed product
Bharadwaj (Brad) Chivukula

Support

Apache Spark

No score
No answers yet
No answers on this topic

Data Science Workbench

Data Science Workbench 5.0
Based on 1 answer
It is expensive and difficult to install and maintain.

Alternatives Considered

Apache Spark

It is easy to learn, read and to maintain. It brings the best of the Ruby on Rails framework from Java that helps to create a web service so easily. Communication is one of the most distinctive features of Apache Spark compared to alternative products. You are able to communicate with your colleague in your team who also uses Spark while you are on the phone.

Data Science Workbench

Both tools have similar features and are pretty easy to install, deploy, and use. Depending on your existing platform (Cloudera vs. Azure), you need to pick the matching Workbench. Another observation is that Cloudera has better support, where you can get feedback on your questions pretty fast (unlike MS). As it's a new product, I expect MS to be more efficient in handling customers' questions.
Bharadwaj (Brad) Chivukula

Return on Investment

Apache Spark

  • Overall positive impact on the business for analysis of big data using the Hadoop file system
  • Very well received by data scientists in the business despite its shortcomings in analytical dashboarding
Shiv Shivakumar

Data Science Workbench

  • As the tool itself can easily access all HDFS and Spark data, the wait time between teams has been reduced
  • Installation was a breeze, and ramp up time was fairly easy
Bharadwaj (Brad) Chivukula

Pricing Details

Apache Spark

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No

Data Science Workbench

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
