What users are saying about
25 Ratings
Score 8.2 out of 10
9 Ratings
Score 7.6 out of 10
trScore algorithm: Learn more at https://www.trustradius.com/static/about-trustradius-scoring

Likelihood to Recommend

Amazon EMR

Well suited if you want to quickly set up a distributed compute platform, such as Spark. But you have to be advanced enough that you really want to separate compute from data storage. For example, for certain applications a packaged solution such as an MPP database (e.g. Redshift) is much easier to set up than Spark on EMR and S3 with the appropriate file formats.

Presto

Presto is for simple interactive queries, whereas Hive is for reliable processing. If you have a fact-dimension join, Presto is great. However, for fact-fact joins Presto is not the solution. Presto is a great replacement for proprietary technology like Vertica.
Praveen Murugesan

Pros

Amazon EMR

  • Ease of use and ease of setup
  • Autoscaling functionality
  • Integrated into the AWS environment

Presto

  • Fast - Presto is incredibly fast due to its optimized query engine and is well suited for interactive analysis.
  • Flexible - Presto is highly flexible because it uses a plug-and-play model for data sources. Joining and querying across different data sources (e.g. HDFS, MySQL, Kafka) is very easy with Presto.
  • ANSI SQL - Presto follows ANSI SQL, the recognized SQL standard, which allows easy query migration without much overhead.
  • Large fact + small dimension table joins made fast - By design, Presto outperforms most distributed query engines on this type of query.
Praveen Murugesan
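
To make the fact + dimension join point concrete, here is a minimal Python sketch of an in-memory hash join. It is an illustration only, not Presto's implementation: the function name, the table names, and the sample rows are all hypothetical. The one assumption it encodes is that the small dimension table can be held entirely in memory while the large fact table is streamed past it.

```python
from collections import defaultdict


def broadcast_hash_join(fact_rows, dim_rows, key):
    """Join a large fact stream against a small dimension table kept in memory."""
    # Build phase: index the small dimension table by the join key.
    dim_index = defaultdict(list)
    for dim in dim_rows:
        dim_index[dim[key]].append(dim)

    # Probe phase: stream fact rows one at a time; none of them are stored.
    for fact in fact_rows:
        for dim in dim_index.get(fact[key], []):
            yield {**dim, **fact}


# Hypothetical sample data: a tiny dimension table and a generated fact stream.
dim_cities = [
    {"city_id": 1, "city": "San Francisco"},
    {"city_id": 2, "city": "Chennai"},
]
fact_trips = ({"trip_id": i, "city_id": 1 + i % 2} for i in range(6))

joined = list(broadcast_hash_join(fact_trips, dim_cities, "city_id"))
```

Memory use in this sketch grows with the dimension table only, which is why the pattern suits a large fact table joined to a small dimension table. If both inputs were large, the build side would have to hold a fact-sized table in memory.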

Cons

Amazon EMR

  • The analytical processes generally run quicker with the standalone tools of Hadoop, Spark, and others. If you only use one big data tool and don't really need things simplified, then Elastic MapReduce is more of an overhead tool that doesn't add much value.
  • The analytical capabilities of Elastic MapReduce are nowhere near as complex or broad as those of non-big data tools. I would suggest not using the tool unless your data really is big data.
  • The machine learning capabilities of Elastic MapReduce (using the big data tools of Hadoop/Spark) are good but are not as easy to use as other machine learning tools.
Thomas Young

Presto

  • Presto was not designed for large fact-fact joins. This is by design: Presto does not leverage disk and uses memory for processing, which in turn makes it fast. However, this is a tradeoff. In an ideal world, people would like to use one system for all their use cases, and Presto should become more comprehensive by solving this problem.
  • Resource allocation is not similar to YARN: Presto uses priority-queue-based query resource allocation, so a query that takes long takes even longer. This might be alleviated by giving some control back to the user to define or override priority.
  • UDF support is not available in Presto. You will have to write your own functions. While this is good for performance, it comes at the huge overhead of building exclusively for Presto, without interoperability with other systems like Hive, SparkSQL, etc.
Praveen Murugesan
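
The resource allocation complaint above ("a query that takes long takes longer") can be pictured with a toy priority-queue scheduler. This Python sketch is not Presto code and does not claim to reproduce Presto's scheduler. It assumes, purely for illustration, that each time slice goes to the query that has consumed the fewest slices so far. The function name and the query names are hypothetical.

```python
import heapq


def run_queries(work_units):
    """Simulate one-slice-at-a-time scheduling; return the finish time per query."""
    # Heap entries are (slices already consumed, query name).
    # Assumption: the query that has consumed the least runs next.
    heap = [(0, name) for name in sorted(work_units)]
    heapq.heapify(heap)
    remaining = dict(work_units)
    clock = 0
    finished = {}
    while heap:
        used, name = heapq.heappop(heap)
        clock += 1
        remaining[name] -= 1
        if remaining[name] == 0:
            finished[name] = clock
        else:
            heapq.heappush(heap, (used + 1, name))
    return finished


# Hypothetical workload: one long query competing with two short ones.
finish_times = run_queries({"long_report": 6, "short_a": 2, "short_b": 2})
alone = run_queries({"long_report": 6})
```

In this toy model the long query needs 6 slices when it runs alone but finishes at slice 10 when it shares the queue, while the short queries finish early. A user-defined priority override, as the reviewer suggests, would amount to changing the first element of each heap entry.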

Alternatives Considered

Amazon EMR

Having one of these enterprise edition licenses comes at its own cost. But the flexibility to have the cluster spin up with the workbenches and code snippets on it is really beneficial. In particular, if one had to move off EMR and consider an option that reduces the debugging time in establishing connections to AWS resources, I would love to use the mentioned three resources on EC2. This would definitely reduce processing time, as there is the flexibility to test and execute a code snippet in real time and monitor its performance.

Presto

I think Presto is one of the best solutions out there today at the cutting edge for interactive query analysis. One of the challenges is that Presto is a niche tool for the interactive query use case and doesn't have as many knobs and whistles as Spark. In the foreseeable future, if they are able to make Presto work without the need for Hive and close all the gaps, it could be game-changing and a direct threat to Spark.
Praveen Murugesan

Return on Investment

Amazon EMR

  • Amazon Elastic MapReduce has had a positive ROI in the sense that it saved time managing big data projects where analysts were using different big data tools. Essentially, an increase in employee productivity.
  • Elastic MapReduce is not worth it in cases where you're just trying things out. You'll likely lose money unless you're sure that using MapReduce is a good idea.
  • Elastic MapReduce takes some time to learn, although not too much. If the employee is less well-versed in big data analytics, the software is a high hill to climb that eats up employee time.
Thomas Young

Presto

  • Presto has helped scale Uber's interactive data needs. We have migrated a lot out of proprietary tech like Vertica.
  • Presto has helped us build data-driven applications on its stack rather than maintain separate online/offline stacks.
  • Presto has helped us build data exploration tools by leveraging its power for interactive queries, and it is immensely valuable for data scientists.
Praveen Murugesan

Pricing Details

Amazon EMR

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details

Presto

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details