What users are saying about

Apache Spark

97 Ratings
Score 8.6 out of 10

MapR

14 Ratings
Score 7.8 out of 10

Likelihood to Recommend

Apache Spark

If you run a distributed environment with applications that rely on batch processing, analytics, streaming, machine learning, or graph processing, then I cannot recommend Spark enough. It is easy to get going, simple to learn (relative to similar technologies), and fits a variety of use cases, all while giving you great performance.

MapR

MapR is better suited to people who know what they are doing. I consider MapR the Hadoop distribution that professionals use.

Pros

Apache Spark

  • Ease of use: the Spark API allows for minimal boilerplate and can be used from a variety of languages, including Python, Scala, and Java.
  • Performance: for most applications, we have found that jobs perform better on Spark than on other distributed processing technologies like MapReduce, Hive, and Pig.
  • Flexibility: the framework comes with support for streaming, batch processing, SQL queries, machine learning, etc. It can be used in a variety of applications without needing to integrate a lot of other distributed processing technologies.

MapR

  • Out-of-the-box high availability for multiple Hadoop services, which brings it up to enterprise standards: high availability of the JobTracker and CLDB in Hadoop 1.x, HA for Impala services, etc. Fewer headaches for my team when a service fails.
  • Performance improvements after migrating from HBase to MapR Tables.
  • Pioneer of HDFS-NFS integration.
  • The volume concept for HDFS storage allocation, which could be controlled from the MCS GUI, was great.

Cons

Apache Spark

  • Resource heavy: jobs, in general, can be very memory intensive, and you will want the nodes in your cluster to reflect that.
  • Debugging: it has gotten better with every release, but it can sometimes be difficult to debug an error because of ambiguous or misleading exceptions and stack traces.

MapR

  • It takes time for the latest versions of Apache ecosystem tools to be released, as they have to be adapted first.
  • When you have issues related to MapR-FS or MapR Tables, it's hard to figure them out by yourself.
  • Sometimes new versions of ecosystem tools are released without proper QA.

Alternatives Considered

Apache Spark

I prefer Apache Spark to Hadoop, since in my experience Spark is more usable and comes equipped with simple APIs for Scala, Python, Java, and Spark SQL, and it gives interactive feedback on commands through a REPL. Apache Spark also seems to have the best performance when processing large data sets in memory, so more workloads can be offloaded to Spark than to Hadoop, although Hadoop is also a very useful tool.
Carla Borges

MapR

When we were shopping, MapR had the momentum, high availability even on Hadoop 1.x, an improved file system, and a better central control system. Now it looks like the situation has changed a lot.

Return on Investment

Apache Spark

  • Apache Spark performs faster than MapReduce.
  • The combination of Python and Spark is the best: shorter code and fast, efficient performance.
  • Can replace an RDBMS.
Kartik Chavan

MapR

  • Less manual intervention for maintaining a cluster.

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No
Additional Pricing Details

MapR

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No
Additional Pricing Details