Apache Spark

99 Ratings
Score 8.6 out of 10

Databricks Unified Analytics Platform

10 Ratings
Score 8.3 out of 10

Likelihood to Recommend

Apache Spark

Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to perform big data transformations as offline jobs; use MongoDB-like distributed database systems for more real-time queries.
Nitin Pasumarthy

Databricks Unified Analytics Platform

  • Databricks generally fits 95% of what you need to do
  • Primarily the ability to transform data and/or do ad-hoc data science work

Pros

  • Machine learning
  • Data analysis
  • Workflow process (faster than MapReduce)
  • SQL connector to multiple data sources
Anson Abraham
  • There is Databricks Community Edition, which is a free version. It gives beginners an easy start with a big data platform. It does not have every feature of the full version but is still adequate for very new coders.
  • Many training resources are available for developers, data scientists, data engineers, and other IT professionals to learn Apache Spark.
Ann Le

Cons

  • Memory management is very weak.
  • PySpark is not as robust as Scala with Spark.
  • Spark master high availability (HA) is needed. It is not as highly available as it should be.
  • Data locality should not be a necessity, though it does help performance. I would prefer not to depend on locality.
Anson Abraham
  • Better localized testing
  • When they were primarily OSS Spark, it was easier to test and manage releases than with the newer Databricks Runtime. I wish the Runtime offered more configuration rather than just picking a version.
  • Graphing support became non-existent, even though it was once one of the compelling parts of their general engine.

Alternatives Considered

Compared with MapReduce, Spark was faster and easier to manage, especially for machine learning, where MapReduce is lacking. Apache Storm was also slower and didn't scale as well as Spark does. Spark's elasticity was easier to apply than that of Storm or MapReduce, and managing resources for Spark was easier than for Storm as well.
Anson Abraham
When we started using it, only the notebook experience was mature. However, Databricks was very helpful in giving us direct support to get onto their platform. There was really little to compare them to at the time. AWS has services, but not with the same low-cost angle.

Return on Investment

  • Workflow process using Spark went from 1 day to 2 hours
  • Spark Streaming allowed for quick determination of data validity
  • Spark on YARN was good for management, but Spark with Kubernetes was easier to use.
Anson Abraham
  • Machine learning is a very new concept and not many universities offer to teach it. My school and a few others have been utilizing Databricks as one of the tools to teach and learn machine learning. By doing this, my university is creating a strong future workforce for the job market.
Ann Le

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No

Databricks Unified Analytics Platform

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No