What users are saying about
113 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow noopener noreferrer'>trScore algorithm: Learn more.</a>
Score 8.4 out of 10
15 Ratings
Score 8.6 out of 10

Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and making sense of very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are only partially used because they take too much time to run once the data sets get very large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young

MapR

MapR is better suited for people who know what they are doing. I consider MapR the Hadoop distribution that professionals use.

Pros

Apache Spark

  • Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between SQL and Scala / Python style of munging data
Nitin Pasumarthy

MapR

  • MapR had very fast I/O throughput. The write speed was several times faster than what we could achieve with the other Hadoop vendors (Cloudera and Hortonworks). This is because MapR does not use HDFS, which is essentially a "meta filesystem" built on top of the filesystem provided by the OS. MapR has its own filesystem, called MapR-FS, which is a true filesystem and accesses the raw disk drives.
  • The MapR filesystem is very easy to integrate with other Linux filesystems. When working with HDFS from Apache Hadoop, you usually have to use either the HDFS API or various Hadoop/HDFS command line utilities to interact with HDFS. You cannot use command line utilities native to the host operating system, which is usually Linux. At least, it is not easily done without setting up NFS, gateways, etc. With MapR-FS, you can mount the filesystem within Linux and use the standard Unix commands to manipulate files.
  • The HBase distribution provided by MapR is very similar to the Apache HBase distribution, which is nice if you are more accustomed to using Apache HBase. Cloudera and Hortonworks add GUIs and various other tools on top of their HBase distributions.
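
As a rough illustration of the point about mounting MapR-FS, the sketch below manipulates files with ordinary Python file operations, much as standard Unix commands would, rather than through an HDFS client. It is a minimal sketch under assumptions, not MapR's API: the function name `archive_logs`, the directory names, and the mount path mentioned in the comment are all hypothetical, and a temporary directory stands in for the mounted cluster filesystem so the example runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def archive_logs(mount_root: Path, src_dir: str, dst_dir: str) -> List[str]:
    """Move every *.log file from src_dir to dst_dir under mount_root.

    Only ordinary file operations are used; no Hadoop client is involved.
    """
    src = mount_root / src_dir
    dst = mount_root / dst_dir
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.glob("*.log")):
        shutil.move(str(path), str(dst / path.name))
        moved.append(path.name)
    return moved


def demo() -> List[str]:
    # Assumption: on a real cluster, the root would be the mount point
    # (hypothetical example: /mapr/my.cluster). A temporary directory is
    # used here as a stand-in so the sketch is runnable without a cluster.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "incoming").mkdir()
        (root / "incoming" / "a.log").write_text("first\n")
        (root / "incoming" / "b.log").write_text("second\n")
        (root / "incoming" / "notes.txt").write_text("keep\n")
        return archive_logs(root, "incoming", "archive")


if __name__ == "__main__":
    print(demo())
```

On a mounted filesystem, `mount_root` would simply point at wherever the filesystem is mounted; the rest of the code would not need to change.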

Cons

Apache Spark

  • Memory management is very weak.
  • PySpark is not as robust as Scala with Spark.
  • Spark master high availability (HA) is needed. It is not as highly available as it should be.
  • Data locality should not be a necessity, although it does bring some improvement. I would prefer not to depend on locality.
Anson Abraham

MapR

  • It takes time for the latest versions of Apache ecosystem tools to be released, as they have to be adapted.
  • When you have issues related to MapR-FS or MapR Tables, it's hard to figure them out by yourself.
  • Sometimes new versions of ecosystem tools are released without proper QA.

Alternatives Considered

Apache Spark

Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.

MapR

I don't believe there is as much support for MapR yet compared to other more widely known products.
Chavez Kattick

Return on Investment

Apache Spark

  • It has had a very positive impact, as it helps reduce the data processing time and thus helps us achieve our goals much faster.
  • Being easy to use, it allows us to adapt to the tool much faster than with others. It also runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud, and can access various data sources. This makes it very useful.
  • It was very easy for me to use Apache Spark and learn it since I come from a background of Java and SQL, and it shares those basic principles and uses a very similar logic.
Carla Borges

MapR

  • Increased employee efficiency for sure. Our clients have various levels of expertise in their deployment and user teams, and we never receive complaints about MapR.
  • MapR is used by one of our financial services clients for fraud detection and user pattern analysis. They are able to turn around data much faster than they previously could with in-house applications.

Pricing Details

Apache Spark

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No

MapR

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
