Amazon EMR (Elastic MapReduce) vs. Hortonworks Data Platform

Overview
Amazon EMR
  • Rating: Score 8.6 out of 10
  • Most Used By: N/A
  • Product Summary: Amazon EMR is a cloud-native big data platform for processing vast amounts of data quickly, at scale. Using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi (Incubating), and Presto, coupled with the scalability of Amazon EC2 and scalable storage of Amazon S3, EMR gives analytical teams the engines and elasticity to run petabyte-scale analysis.
  • Starting Price: N/A
Hortonworks Data Platform
  • Rating: Score 7.0 out of 10
  • Most Used By: N/A
  • Product Summary: Hortonworks Data Platform (HDP) is an open source framework for distributed storage and processing of large, multi-source data sets. HDP modernizes IT infrastructure and keeps data secure, in the cloud or on-premises, while helping to drive new revenue streams, improve customer experience, and control costs. Hortonworks merged with Cloudera in early 2019.
  • Starting Price: N/A
Pricing
Editions & Modules
  • Amazon EMR (Elastic MapReduce): No answers on this topic
  • Hortonworks Data Platform: No answers on this topic
Offerings
Pricing Offerings
Free Trial
  • Amazon EMR: No
  • Hortonworks Data Platform: No
Free/Freemium Version
  • Amazon EMR: No
  • Hortonworks Data Platform: No
Premium Consulting/Integration Services
  • Amazon EMR: No
  • Hortonworks Data Platform: No
Entry-level Setup Fee
  • Amazon EMR: No setup fee
  • Hortonworks Data Platform: No setup fee
Community Pulse
Amazon EMR (Elastic MapReduce) | Hortonworks Data Platform
Considered Both Products
Amazon EMR
Chose Amazon EMR (Elastic MapReduce)
The alternatives to EMR are mainly Hadoop distributions owned by the 3 companies above. I have not used the other distributions, so it is difficult to comment, but the general tradeoff is that, at the cost of a longer setup time and more infra management, you get more flexible …
Chose Amazon EMR (Elastic MapReduce)
Having one of these enterprise edition licenses comes with its own costs. But the flexibility to have the cluster spin up with the workbenches and code snippets on it is really beneficial, especially if one had to move out of EMR and consider an option which reduces the …
Hortonworks Data Platform

No answer on this topic

Best Alternatives
Small Businesses
  • Amazon EMR (Elastic MapReduce): No answers on this topic
  • Hortonworks Data Platform: No answers on this topic
Medium-sized Companies
  • Amazon EMR (Elastic MapReduce): Cloudera Manager, Score 9.7 out of 10
  • Hortonworks Data Platform: Cloudera Manager, Score 9.7 out of 10
Enterprises
  • Amazon EMR (Elastic MapReduce): IBM Analytics Engine, Score 8.8 out of 10
  • Hortonworks Data Platform: IBM Analytics Engine, Score 8.8 out of 10
User Ratings
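Likelihood to Recommend
  • Amazon EMR (Elastic MapReduce): 8.4 (19 ratings)
  • Hortonworks Data Platform: 7.0 (9 ratings)
Usability
  • Amazon EMR (Elastic MapReduce): 8.3 (3 ratings)
  • Hortonworks Data Platform: - (0 ratings)
Support Rating
  • Amazon EMR (Elastic MapReduce): 9.0 (3 ratings)
  • Hortonworks Data Platform: - (0 ratings)
Implementation Rating
  • Amazon EMR (Elastic MapReduce): - (0 ratings)
  • Hortonworks Data Platform: 9.0 (1 rating)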
User Testimonials
Amazon EMR (Elastic MapReduce) | Hortonworks Data Platform
Likelihood to Recommend
Amazon AWS
We use it to run data preparation that takes a few hours on EC2 on a Spark-based EMR cluster instead, which completes the preparation in minutes rather than hours. It is easy to use and lets us choose between Hadoop and Spark. Processing time drops from 5-8 hours to 25-30 minutes compared with the EC2 instance, and by more in some cases.
Cloudera
I find HDP easy to use, and it solves most of the problems for people looking to manage their big data. Evaluating the Hortonworks Data Platform is easy, as it is free to download and install in your cluster. The single-node cluster available as a Sandbox is also easy to use for POCs.
Pros
Amazon AWS
  • Amazon Elastic MapReduce works well for managing analyses that use multiple tools, such as Hadoop and Spark. If it were not for the fact that we use multiple tools, there would be less need for MapReduce.
  • MapReduce is always on. I've never had a problem getting data analyses to run on the system. It's simple to set up data mining projects.
  • Amazon Elastic MapReduce has no problems dealing with very large data sets. It processes them just fine. With that said, the outputs don't come instantaneously. It takes time.
Cloudera
  • It does a good job of packaging a lot of big data components into bundles and lets you use the ones you are interested in or need. It supports an extensive list of components which lets us solve many problems.
  • It provides the ability to manage installations and maintenance using Apache Ambari. It helps us use management packs to install and upgrade components easily. It also helps us add and remove components, add and remove hosts, and perform upgrades conveniently. It also provides alerts and notifications and monitors the environment.
  • What they excel at is packaging open source components that are relevant, useful, and complementary to each other, as well as contributing enhancements to those components. They do a great job in the community of keeping on top of what would be useful to users, fixing bugs, and working with other companies and individuals to make the platform better.
Cons
Amazon AWS
  • Sometimes bootstrapping certain tools comes with debugging costs. The tools provided by some of the enterprise editions are great compared to EMR.
  • Unlike some of the enterprise editions, EMR does not provide on-premises options.
  • No UI client for saving workbooks or code snippets. Everything has to go through the submission process, which is also not very convenient for tracking jobs.
Cloudera
  • Since it doesn't come with proprietary tools for big data management, additional integration is needed (for query handling, search, etc.).
  • It was very straightforward to store clinical data without relations, such as data from sensors of a medical device. But it has limitations when you need to combine that data with other clinical data in a structured format (e.g. lab results, diagnosis).
  • Overall look and feel of front-end management tools (e.g. monitoring) are not good. It is not bad but it doesn't look professional.
Usability
Amazon AWS
I give Amazon EMR this rating because, while it is great at simplifying running big data frameworks, providing the Amazon EMR highlights, product details, and pricing information, and analyzing vast amounts of data, it can run slowly, freeze, and glitch sometimes. So overall Amazon EMR is pretty good to use, apart from some basic issues.
Cloudera
No answers on this topic
Support Rating
Amazon AWS
There's a vast group of trained and certified (by AWS) professionals ready to work for anyone that needs to implement, configure or fix EMR. There's also a great amount of documentation that is accessible to anyone who's trying to learn this. And there's also always the help of AWS itself. They have people ready to help you analyze your needs and then make a recommendation.
Cloudera
No answers on this topic
Implementation Rating
Amazon AWS
No answers on this topic
Cloudera
Try not to change variable names.
Alternatives Considered
Amazon AWS
Snowflake is a lot easier to get started with than the other options. Snowflake's data lake building capabilities are far more powerful. Although Amazon EMR isn't our first pick, we've had an excellent experience with EC2 and S3. Because of our current API interfaces, it made more sense for us to continue with Hadoop rather than explore other options.
Cloudera
We chose [Hortonworks Data Platform] because it's free and because [it] was an IBM partner, suggested as a big data platform after the BigInsights platform.
You can install it on several physical computers without high specs, and then use it to learn how to deploy and configure a complete big data cluster.
We also installed it on a cloud infrastructure of 5 virtual machines.
Return on Investment
Amazon AWS
  • Positive: Helped process the jobs amazingly fast.
  • Positive: Did not have to spend much time to learn the system, therefore, saving valuable research time.
  • Negative: Not flexible for some scenarios, like when some plugins are required, or when the project has to be moved in-house.
Cloudera
  • It is difficult to have a negative impact, because the required investment is not that high.
  • The big open community behind Hortonworks and the related Apache projects makes it easy to put 'the wheel to meet the road' quite quickly.
  • We have seen management meetings where the attendees were impressed by the results achieved with the data lake built on HDP.