Reviews (1-4 of 4)
MapR is being used by our business research department to collect and evaluate media data. We've also used MapR to create test data for performance testing with other technologies. It works well with other big data technologies like HBase, Hive, and Spark, and it provides a performance advantage when used with HBase.
- MapR allows easy integration with HBase and MapR DB.
- Easy trial server setup for product testing.
- Excellent training program to help new users get up-to-date with MapR and related products.
- HBase training needs more material covering the topics that are tested in the HBase certification.
Read Chavez Kattick's full review
MapR will be a benefit with any application where HBase is used and performance is a big factor.
We deploy MapR for large corporations deploying big data projects. Our software installs and configures MapR plain or fully configured with spreadsheets. The flexibility MapR has allows us to provision backend nodes with the proper networking and disk configuration automatically for optimized performance. It is being used by several corporations we support in large scale big data deployments for fraud detection and user behavior analysis.
- MapR is fast. We were able to beat the TeraSort record in 2012 on 360 nodes during the initial deployment of a cluster that is now 4000 nodes.
- MapR is reliable. We rarely if ever have problems deploying MapR. It's the kind of software that "just works."
- MapR scales. We have a client using MapR in all their big data clusters, ranging from 50 to 630 machines. Test, development, and production all deploy MapR.
- I think MapR's main problem is name recognition. Hortonworks and Cloudera are both big names in the industry, but their deployment mechanisms are a little more difficult to use, especially when trying to fully automate deployment.
- Documentation could always be better. But really, if that's your main weakness, it's everybody's weakness.
MapR is better suited to people who know what they are doing. I consider MapR the Hadoop distribution professionals use.
December 01, 2015
Score 7 out of 10
We were implementing a fully relational database on top of HBase. This was not a "SQL on Hadoop" project where we supported a subset of SQL. We developed a fully SQL-92 (partially SQL-99) compliant database on top of HBase. We supported all three major commercial distributions of HBase (Cloudera, Hortonworks, and MapR).
- MapR had very fast I/O throughput. The write speed was several times faster than what we could achieve with the other Hadoop vendors (Cloudera and Hortonworks). This is because MapR does not use HDFS, which is essentially a "meta filesystem" built on top of the filesystem provided by the OS. MapR has its own filesystem, MapR-FS, which is a true filesystem and accesses the raw disk drives.
- The MapR filesystem is very easy to integrate with other Linux filesystems. When working with HDFS from Apache Hadoop, you usually have to use either the HDFS API or various Hadoop/HDFS command line utilities to interact with HDFS. You cannot use command line utilities native to the host operating system, which is usually Linux. At least, it is not easily done without setting up NFS, gateways, etc. With MapR-FS, you can mount the filesystem within Linux and use the standard Unix commands to manipulate files.
- The HBase distribution provided by MapR is very similar to the Apache HBase distribution, which is nice if you are more accustomed to using Apache HBase. Cloudera and Hortonworks add GUIs and various other tools on top of their HBase distributions.
- The MapR web UI console is pretty basic. When you compare it to Cloudera Manager and Apache Ambari (ships with Hortonworks), it is definitely in third place. MapR has definitely invested heavily in file system performance with their MapR-FS, but they should invest a bit more in making it easier to administer and manage a MapR cluster.
- MapR should tune MapR-FS to work better with HBase. Once again, MapR has invested heavily in its own proprietary technology, in this case MapR-DB. MapR-DB is a "wire compatible" version of HBase, but it is a bit of a different beast from HBase. What this means is that we ran into performance issues when running vanilla HBase on MapR-FS. Basically, the write throughput of MapR-FS was so fast that it caused compaction storms in HBase. Slowing down the HBase flushes actually improved overall system throughput for HBase on MapR-FS.
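One plausible reading of the compaction-storm point above is that very fast writes produce many small store files, and each extra file brings the next compaction closer. The toy model below is only a sketch of that reasoning: it is not HBase's real compaction policy, and the function name, the merge-everything rule, and the threshold of 3 are all illustrative assumptions.

```python
def simulate_writes(total_mb, flush_size_mb, compaction_threshold=3):
    """Toy model (NOT HBase's actual algorithm): every flush writes one
    store file; once `compaction_threshold` files exist, all of them are
    merged into a single file and every merged byte counts as rewritten."""
    store_files = []   # sizes in MB of the files currently on disk
    flushes = 0
    compactions = 0
    rewritten_mb = 0
    written = 0
    while written < total_mb:
        # Flush the next chunk of buffered data to a new store file.
        chunk = min(flush_size_mb, total_mb - written)
        written += chunk
        store_files.append(chunk)
        flushes += 1
        # Assumed rule: compact everything once enough files pile up.
        if len(store_files) >= compaction_threshold:
            merged = sum(store_files)
            rewritten_mb += merged
            store_files = [merged]
            compactions += 1
    return {
        "flushes": flushes,
        "compactions": compactions,
        "rewritten_mb": rewritten_mb,
    }


if __name__ == "__main__":
    # Same 1 GB of writes, flushed in small versus large chunks.
    print("64 MB flushes: ", simulate_writes(1024, 64))
    print("256 MB flushes:", simulate_writes(1024, 256))
```

In this model, fewer and larger flushes trigger far fewer compactions and rewrite much less data for the same volume of writes, which is consistent with the reviewer's observation that slowing flushes improved overall throughput.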
If you need Hadoop and just need raw speed for I/O and have a Hadoop savvy group of engineers who don't need/like web UIs, then MapR is a great fit for you. If you are new to Hadoop or have DevOps folks that are not Hadoop gurus, choosing MapR as your Hadoop vendor will have a steeper learning curve as you will need to do more training and build more admin consoles for them.
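The review's point that a mounted MapR-FS can be handled with ordinary tools also applies to ordinary file APIs. The sketch below uses only the Python standard library and nothing MapR-specific; the helper names are made up for illustration, and the `/mapr/my.cluster.com/...` path in the usage section is a hypothetical mount point, not one taken from the review.

```python
import shutil
from pathlib import Path


def summarize_directory(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def copy_into_cluster(local_file, cluster_dir):
    """Copy a local file into a directory on the mounted filesystem,
    creating the directory if needed, and return the destination path."""
    dest_dir = Path(cluster_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(local_file, dest_dir))


if __name__ == "__main__":
    # HYPOTHETICAL mount point; substitute wherever the cluster is mounted.
    mount = Path("/mapr/my.cluster.com/user/analyst")
    if mount.exists():
        print(summarize_directory(mount))
    else:
        print(f"{mount} is not mounted on this machine")
```

Because the code only sees a directory tree, the same functions work against a local disk, which makes them easy to try out before pointing them at a cluster.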
My team was maintaining multiple Hadoop clusters on a high UCS hardware configuration powered by MapR. We were also maintaining a big cluster in a production environment and other clusters for development, QA, disaster recovery and POC. All clusters were configured with high availability. Multiple internal teams ran their application jobs on our clusters, and my team was responsible for managing and maintaining them. We evaluated and implemented new big data and related tools introduced by MapR. The goal was to make sure the application customers using our clusters stayed happy. Day-to-day jobs on our clusters included traditional Java MapReduce, Streaming, Pig, Hive, MapR Tables and other in-memory application jobs like Spark for analytics on our company's internal data. In total, nearly 50 different Hadoop use cases were implemented across our various clusters. At times we needed support from MapR on complex issues that my team could not resolve.
- Out-of-the-box high availability for multiple Hadoop services, which really brings it up to enterprise standards: HA for the JobTracker and CLDB in Hadoop 1.x, HA for Impala services, etc. Fewer headaches for my team when a service fails.
- Performance improved when we migrated from HBase to MapR Tables.
- HDFS-NFS integration pioneer.
- The volume concept for HDFS storage allocation, which could be controlled from the MCS GUI, was great.
- It takes time to get the latest versions of Apache ecosystem tools, as each release has to be adapted first.
- When you have issues related to MapR-FS or MapR Tables, it's hard to figure them out on your own.
- Sometimes new versions of ecosystem tools are released without proper QA.
Choose according to your use case. In our situation, we never had to worry about service interruptions. When we kicked off a couple of years ago, MapR was doing excellent work. Support is not great; [you'll] be lucky to get good support engineers on request. If you have issues with MapR-FS or MapR Tables, never waste time looking into them.