What users are saying about
102 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow'>trScore algorithm: Learn more.</a>
Score 8.5 out of 10
2 Ratings
Score 8 out of 10


Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well suited to querying and making sense of very large data sets. It offers many advanced machine learning and econometrics tools, although in practice these tools are only partially used because they take too long to run once the data sets become very large. The software is not well suited to projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young

SAP Vora

I spent more than a year working with SAP Vora, SAP Datahub, and SAP Leonardo with ML and IoT. I believe this product has potential, but it is not easy to adopt. SAP has to keep in mind how quickly open-source big data technologies can deliver results. I know SAP is stabilizing the product and competing hard against many open-source technologies, but it still has a long way to go.
Dhinesh Kumar Ganeshan, PMP, CSM

Pros

  • Ease of use: the Spark API requires minimal boilerplate and can be used from a variety of languages, including Python, Scala, and Java.
  • Performance: for most applications, we have found that jobs perform better on Spark than on other distributed processing technologies such as MapReduce, Hive, and Pig.
  • Flexibility: the framework comes with support for streaming, batch processing, SQL queries, machine learning, etc. It can be used in a variety of applications without needing to integrate many other distributed processing technologies.
No photo available
  • Modelling with SAP HANA and Hadoop
  • Real-time analysis using Vora and HANA as a streaming engine
  • Time series analysis on large chunks of data
  • Machine learning capabilities on Hadoop tables and Spark contexts
Dhinesh Kumar Ganeshan, PMP, CSM

Cons

  • More documentation and training should come with the application, especially for debugging, since the process is difficult to understand.
  • It should be more attentive to users and provide tutorials to reduce the learning curve.
  • There should be more grouping algorithms.
Carla Borges
  • Vora 2.0 in on-premise scenarios could be improved, as adoption of the cloud is not an easy sell.
  • Kubernetes and Docker integration needs to be more seamless and quicker to understand. If this is simplified, the product will be easier to adopt.
  • Data Hub orchestration and integrations could be simplified so that quick adoption within SAP BW, ECC, and S/4HANA scenarios is possible.
Dhinesh Kumar Ganeshan, PMP, CSM

Alternatives Considered

There are a few newer frameworks for general processing, such as Flink and Beam; frameworks for streaming, such as Samza and Storm; and traditional MapReduce. I think Spark is at a sweet spot: it is clearly better than MapReduce for many workflows, yet it has gained enough community support that there is little risk in deploying it. It also integrates batch and streaming workflows and APIs, providing an all-in-one package for multiple use cases.
No photo available
We selected SAP Vora because we needed accelerated integration with different sources holding a huge amount of data. In addition, its data de-duplication eliminated duplicate entries quickly and easily, which ultimately led us and the customer to prefer SAP Vora over other products and helped remove limitations on how we use and explore our data lake.
Pradeep Bele

Return on Investment

  • By learning Spark, we can become certified and/or provide proper recommendations or implementations on Spark solutions.
  • With a background in Hadoop distributed processes, it has been easy to understand and diagnose how Spark handles the transfer of data within a cluster, especially when using YARN as the resource manager and HDFS as the data source.
  • Staying up to date with the latest changes to Spark has become a repetitive task. Most Hadoop distributions support only Spark 1.6 at the moment; Spark 2.0 has introduced some useful features, but those require a rewrite of existing applications.
Jordan Moore
  • Negative impact: PoCs and RFIs will need more time for adoption, and decision-making gets delayed.
  • Positive impact: it is a great leap for SAP to adopt big data technologies and AI within the cloud stream, but selling it is going to take time.
Dhinesh Kumar Ganeshan, PMP, CSM

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details

SAP Vora

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details