Apache Hive vs. Presto

August 21st, 2020 3 min read

Apache Hive and Presto are both analytics engines that businesses can use to query their data and generate insights. Apache Hive is a data warehousing tool built on Hadoop that lets users query large datasets and store the results. In contrast, Presto is built to process SQL queries of any size at high speed. Both tools are most popular with mid-sized businesses and larger enterprises that run a large volume of SQL queries.

Features

Apache Hive and Presto both enable organizations to perform queries on business data, but they also have some standout features that set them apart from each other.

Apache Hive is designed to facilitate analytics on large amounts of data, while also storing the results as tables. Businesses using Hadoop will appreciate that Apache Hive is built on top of the Hadoop Distributed File System (HDFS), making it easy to integrate into their existing infrastructure. Businesses will get the most out of Apache Hive if they are performing ad hoc queries on large datasets.

Presto is an open source SQL query engine that can run both small, simple queries and large, complex ones. Because Presto runs queries at high speed, it is a good choice for businesses that want to run many queries without delays. For businesses using Hadoop that want the query speed Presto offers, it is worth noting that Presto includes an integration with Apache Hive.
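As a rough illustration of that integration: Presto addresses tables as catalog.schema.table, and a Hive connector is commonly registered under a catalog named hive. The sketch below only builds the query text and does not connect to a cluster. The helper names and the default.orders table are hypothetical, chosen for illustration.

```python
def qualified_table(catalog: str, schema: str, table: str) -> str:
    """Return a Presto-style fully qualified name: catalog.schema.table."""
    parts = (catalog, schema, table)
    for part in parts:
        # Accept only simple identifiers so the name can be used unquoted.
        if not part.isidentifier():
            raise ValueError(f"not a simple identifier: {part!r}")
    return ".".join(parts)


def row_count_query(catalog: str, schema: str, table: str) -> str:
    """Build a row-count query against a fully qualified table."""
    return f"SELECT count(*) FROM {qualified_table(catalog, schema, table)}"


# Hypothetical table "orders" in the "default" schema of a "hive" catalog.
print(row_count_query("hive", "default", "orders"))
```

The resulting text could then be submitted through whichever Presto client a team already uses. The catalog name depends on how the connector was configured, so hive here is an assumption.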

Limitations

Apache Hive and Presto are both popular choices for businesses seeking analytics engines, with some even using both, but they also have some limitations that are important to consider.

Apache Hive provides excellent support for large datasets and for businesses that use Hadoop, but it can't run SQL queries as fast as Presto. Businesses looking for the fastest option available may need to look elsewhere. Additionally, Apache Hive includes built-in support for Hadoop, but businesses using other tools will not be able to take advantage of that benefit.

Presto runs SQL queries quickly, but it doesn't include built-in support for the Hadoop Distributed File System (HDFS) and relies on other tools, such as its Apache Hive integration, for that use case. Businesses looking for a quick solution that works with Hadoop out of the box may prefer Apache Hive. Additionally, businesses less concerned with scalability and maximum query speed may prefer the support for large datasets that Apache Hive provides.

Pricing

Apache Hive and Presto are both open source tools, so the source code for each one is available for free. 
