Page 2 of 2 Apache Hive Reviews & Ratings 2024

Overview

What is Apache Hive?

Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.

Recent Reviews

TrustRadius Insights

December 15, 2023

Apache Hive is a versatile software that has been widely used across various departments and organizations for different use cases. It has …

With Apache Hive, you can enter the world of Big Data

8 out of 10

July 06, 2022

On-premises large data processing is handled by Apache Hive, which is running on Cloudera servers. In order to use Apache Hive, you must …

Best Distributed Database in the market

6 out of 10

April 19, 2022

Incentivized

We use Apache Hive to store a large set of data, which are huge documents such as problem statements and its answer, not only submitted by …

Help your dev team !

8 out of 10

April 12, 2022

Incentivized

We build our data lake and perform queries on large amounts of data. We group data from multiple sources into a common structure, making …

Spectacular SQL-like interface for accessing Hadoop

9 out of 10

April 11, 2022

Incentivized

To manage and view Apache Hadoop data in a SQL-like format. To be able to query databases across the organization, quickly. To query data …

This system makes active data of value.

8 out of 10

April 09, 2022

Incentivized

We have used the system to migrate data either for new versions or because we will use another operating program, the software helps us to …

Best query platform for ETL.

6 out of 10

April 08, 2022

Incentivized

I used Apache Hive on top of Hadoop for filtering and cleaning data using SQL. It was the part of the project which I was working on. …

It is an advance to the ease of the processes

8 out of 10

April 08, 2022

Incentivized

The software is intuitive from the first steps, one of the first features we take into account for the software does not allow duplicate …

Capabilities of Apache Hive

8 out of 10

April 07, 2022

Incentivized

The main purpose for using Apache Hive was to get insights from data: analyzing the data and using it to make informed business decisions. …

Excellent bigdata warehouse solution

9 out of 10

April 07, 2022

Incentivized

Apache Hive is an open-source data warehouse solution built on top of Hadoop that helps to analyze a very large amount of data.
Our use …

very useful for OLTP

10 out of 10

April 06, 2022

Incentivized

We use Apache to process large data and get the output with less process time. The framework is very much useful for data processing and …

Apache Hive

9 out of 10

November 24, 2021

Incentivized

1. Used Apache Hive to create external and internal tables in Hadoop / BigData projects on Cloudera and Azure platforms. 2. Apache Hive …

Walk into the World of Big Data with Apache Hive

9 out of 10

June 02, 2021

Incentivized

We are using Apache Hive over an on-premises big data setup built on top of Cloudera servers. Use case behind using Apache Hive [it] is …

Reliable and Cheaper one stop Data warehouse solution

9 out of 10

December 28, 2020

Incentivized

I have used Apache Hive in [the] last 3 companies and it's being used by the multiple departments spread across data analytics, …

Big Data the SQL way

8 out of 10

September 23, 2020

Incentivized

I am working as a Research Assistant where I have to process tons of data to produce appropriate findings. Our NLP lab used it for all its …

Apache Hive: Big data querying tool w/SQL interface, but slower, more costly computation

7 out of 10

September 21, 2020

Incentivized

We use Apache Hive to make data-driven decisions. It is used from finance to engineering to sales. It helps aggregate our massive data …

Awards

Products that are considered exceptional by their customers based on a variety of criteria win TrustRadius awards.

Pricing

Apache Hive

N/A

Unavailable

Entry-level set up fee?

No setup fee

Offerings

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services

Alternatives Pricing

ClicData

$79

per month

What is ClicData?

ClicData is a 100% cloud-based business intelligence platform that allows users to connect, process, blend, visualize and share data from a single place. As an automated platform, users are able to rely on the latest version of company data, to ensure users make the right decisions. Hundreds of…

retailMetrix

$399

per month per installation

What is retailMetrix?

retailMetrix is a data analytics platform for retailers with the mission of enabling retailers to get value from their data. retailMetrix processes and stores sales, labor and customer data using data warehouse technologies. Its dashboards and reports allow teams to find the data that matters to…

Product Demos

Apache Hive Hadoop Ecosystem - Big Data Analytics Tutorial by Mahesh Huddar

YouTube

Connecting Microsoft Power BI to Apache Hive using Simba Hive ODBC driver

YouTube

Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive

YouTube

Product Details

Apache Hive Technical Details

Operating Systems	Unspecified
Mobile Application	No

Frequently Asked Questions

Reviewers rate Usability highest, with a score of 8.5.

The most common users of Apache Hive are from Enterprises (1,001+ employees).

Reviews and Ratings

(97)

May 7th 2024

Community Insights

TrustRadius Insights are summaries of user sentiment data from TrustRadius reviews and, when necessary, 3rd-party data sources.

Business Problems Solved

Apache Hive is a versatile software that has been widely used across various departments and organizations for different use cases. It has proven to be particularly helpful in handling large datasets, migrating data between different operating systems, synchronizing programs, and fetching and generating product metrics. Users have found value in using Hive for data analytics, engineering, data science, product management, and IT-related tasks such as improving analysis of big datasets stored in Hadoop HDFS.

Furthermore, Apache Hive has simplified the process of filtering and cleaning data using SQL, reducing the learning curve for handling big data. It allows users to run SQL queries against data in Hadoop, enabling efficient analysis of large datasets without the need to learn a new language. Additionally, Hive has been utilized for building reports, analyzing data stored in the Hadoop file system, processing events gathered in HDFS, and converting them into parquet files for fast querying.

Overall, users have praised Apache Hive for its scalability, accessibility, and cost-effectiveness in storing and retrieving analytics data. It has provided an intuitive solution for storing large datasets, querying big sets of data using SQL, aggregating massive datasets into distilled information for data-driven decision making, and creating external and internal tables in Hadoop/BigData projects. With its ability to process both unstructured and structured data efficiently, Hive has become an essential tool for data analysts, engineers, and business analysts across organizations.

Attribute Ratings

Reviews

(26-35 of 35)

September 13, 2017

Apache Hive Review

Sameer Gupta

Senior Data Analyst

SurveyMonkey (Internet, 501-1000 employees)

Score 8 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

Hive is currently being used across the entire analytics organization at SurveyMonkey. The business problem that we solve through it is accessing and storing large data sets (typically logs) in a scalable and accessible place.

Pros and Cons

SQL-like query engine allows easy ramp-up from a standard RDBMS
Scalability is great
If properly configured, the data retrieval is fantastic

The way we currently have it implemented is quite slow, but I believe that is due more to our implementation
Joins tend to be slow

Likelihood to Recommend

I think Apache Hive is great for a company just stepping into the big data realm. I think the fact that it's open source allows for a variety of tools to be integrated. The fact that it has HiveQL makes for a great transition from a standard RDBMS to a big data tool. This can be very nice in terms of cost savings, as the ramp-up time for an analyst will be quite low.

September 11, 2017

Apache Hive for ETL workloads

Verified User

Analyst in Engineering

Hospital & Health Care Company, 501-1000 employees

Score 5 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

Apache Hive is being used across our organisation for analytical workloads. We use Hive along with the Hortonworks distribution, and it's a great SQL-on-Hadoop tool.

Pros and Cons

Hive is good for ETL workloads on Hadoop.
HiveQL translates SQL-like queries into MapReduce jobs. It supports custom MapReduce scripts to be plugged in.
Hive has two kinds of tables: Hive managed tables and external tables.
Use an external table when other applications such as Pig, Sqoop, or MapReduce also use the file in HDFS. Once we delete the external table from Hive, it deletes only the metadata from Hive, and the original file in HDFS stays.
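The difference between dropping a managed table and an external table can be sketched with a toy model. This is only an illustration under stated assumptions: the `metastore` dictionary, the `create_table` and `drop_table` helpers, and the table names are hypothetical stand-ins, not Hive's actual implementation, and a local temporary directory plays the role of HDFS.

```python
import shutil
import tempfile
from pathlib import Path

# Toy stand-in for the Hive metastore: table name -> (data directory, is_external).
# Hypothetical model for illustration only, not Hive's real code.
metastore = {}


def create_table(name, location, external):
    """Register a table whose data files live under `location`."""
    metastore[name] = (Path(location), external)


def drop_table(name):
    """Remove the metadata; delete the data files only for managed tables."""
    location, external = metastore.pop(name)
    if not external:
        shutil.rmtree(location)


# A temporary local directory plays the role of HDFS here.
root = Path(tempfile.mkdtemp())
for table in ("managed_logs", "external_logs"):
    (root / table).mkdir()
    (root / table / "part-00000").write_text("row1\nrow2\n")

create_table("managed_logs", root / "managed_logs", external=False)
create_table("external_logs", root / "external_logs", external=True)

drop_table("managed_logs")
drop_table("external_logs")

# After both drops the metadata is gone, but only the managed data was deleted.
managed_data_exists = (root / "managed_logs").exists()
external_data_exists = (root / "external_logs").exists()
```

In this model the files behind `external_logs` survive the drop, which is why an external table suits data that other tools also read.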

Use Hive for analytical workloads and write-once, read-many scenarios. Avoid updates and deletes.
Behind the scenes, Hive creates MapReduce jobs. Hive performance is slow compared to Apache Spark.
MapReduce writes its intermediate outputs to disk, whereas Spark operates in memory and uses a DAG.

Likelihood to Recommend

Use it for ETL workloads. I prefer to repeat the same workload with Spark and choose whichever performs better.

April 26, 2017

Apache Hive - Querying Big Data Made Easy!

Verified User

Engineer in Engineering

Computer Software Company, 51-200 employees

Score 8 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

We use Apache Hive for two main use cases: analyzing our ever-growing data volume for insights and reports, and as part of our ETL pipeline, where we found that writing in SQL-like syntax allows for more rapid development while adding little complexity to the overall system.

Apache Hive solves a few issues for us, the main one being the ability to analyze large volumes of data on S3 directly with strong overall performance. We have been able to analyze billions of records in a matter of minutes with a relatively small EC2 cluster using Apache Hive. It also allows our data analysts to simply write SQL and avoids the ramp-up needed for other tools such as Apache Pig.

Pros and Cons

Apache Hive allows us to write expressive solutions to complex problems thanks to its SQL-like syntax.
Relatively easy to set up and start using.
Very little ramp-up to start using the actual product, documentation is very thorough, there is an active community, and the code base is constantly being improved.

Debugging can be messy with ambiguous return codes and large jobs can fail without much explanation as to why.
Hive is only SQL-like; while more features are being added, we have found that some things do not translate over (for example outer joins, inserts, columns can only be referenced once in a select, etc.).
For our ETL jobs it does not seem to be the optimal tool because tuning and performance are difficult; Apache Pig may be better for heavy processing jobs.

Likelihood to Recommend

Apache Hive shines for ad-hoc analysis and plugging into BI tools. Its SQL-like syntax allows for ease of use not only for engineers but also for data analysts. In our experience, there are probably more desirable tools to use if you are planning on integrating Hive into your processing pipeline.

February 28, 2017

Hive Away, but not for everything!

Praveen Murugesan

Engineering Manager - Ride Experience

Uber (Internet, 5001-10,000 employees)

Score 6 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

We use Apache Hive across the whole organization. We built our own in-house Hadoop cluster for data warehousing purposes, complementary to HP Vertica, which we were using. Vertica is limited in scale, and to achieve true scalability and process trillions of records we had to invest in a new solution. Enter Apache Hive. We are very data driven as an organization, so to satisfy people's appetite for data while sticking to something familiar for querying it (SQL), we decided to invest in Apache Hive as a starting point in our new data infrastructure.

Pros and Cons

Hive, which leverages traditional MapReduce at its core, can be used to process a large amount of data without a problem. Any problem that can be solved with MapReduce can now be simply expressed in SQL.
Hive leverages the disk when processing large data and is not limited by the physical memory of any one machine (which is a limitation for systems like Presto). Hence it even allows reasonable fact-fact cross joins.
Hive is extensible with UDFs. For any common patterns you can quickly write your own function set and it can be leveraged by everyone.
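Hive UDFs themselves are usually written as Java classes. As a lighter sketch of the same idea of plugging custom logic into a query, Hive's TRANSFORM ... USING clause can stream rows through an external script as tab-separated lines on standard input and output. The two-column layout (user_id, city) and the function names below are hypothetical and chosen only for illustration.

```python
def normalize_row(line):
    """Turn one tab-separated input row (user_id, city) into an output row.

    The two-column layout is a hypothetical example, not taken from the review.
    """
    user_id, city = line.rstrip("\n").split("\t")
    return f"{user_id}\t{city.strip().title()}"


def transform(lines):
    """Yield one normalized output row per non-empty input line."""
    for line in lines:
        if line.strip():
            yield normalize_row(line)


def run(stdin, stdout):
    """Stream rows the way a TRANSFORM script does: read stdin, write stdout."""
    for row in transform(stdin):
        stdout.write(row + "\n")


# In a deployed script, the file would end with:
#   import sys
#   run(sys.stdin, sys.stdout)
```

Assuming the file is saved as normalize.py and registered with ADD FILE, a query along the lines of SELECT TRANSFORM(user_id, city) USING 'python normalize.py' AS (user_id, city) FROM users could call it. The file name and the users table are assumptions for this sketch.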

Compute Speed - Hive will be my last option to query vs. something like Presto, which has a much smarter query engine. Hive is slow, and I'd use it only if we cannot use something like Presto/Impala.
The SQL syntax of Hive is unique and does not conform to ANSI SQL. This is quite painful for beginners.
The ability to upsert records would be nice to have. Hive is cumbersome for mutable data, where any change requires whole partitions to be rewritten. No one has solved this really well. If this is solved, it could be leveraged by many systems.
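The partition-rewrite pattern behind that last complaint can be sketched as a toy model. It is an illustration only: the in-memory `table` dictionary, the `upsert` helper, and the row fields are hypothetical, and in HiveQL the corresponding step would typically be an INSERT OVERWRITE of each affected partition.

```python
from collections import defaultdict

# Toy table: partition key (a date string) -> list of rows.
# Hypothetical data, for illustration only.
table = {
    "2024-01-01": [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}],
    "2024-01-02": [{"id": 3, "status": "new"}],
}


def upsert(table, updates):
    """Apply (partition, row) updates by rewriting each touched partition in full."""
    by_partition = defaultdict(dict)
    for partition, row in updates:
        by_partition[partition][row["id"]] = row

    rewritten = []
    for partition, changed in by_partition.items():
        # Read the whole existing partition, merge the changes, write it all back.
        merged = {row["id"]: row for row in table.get(partition, [])}
        merged.update(changed)
        table[partition] = list(merged.values())
        rewritten.append(partition)
    return rewritten


rewritten = upsert(table, [
    ("2024-01-01", {"id": 2, "status": "done"}),  # update an existing row
    ("2024-01-01", {"id": 4, "status": "new"}),   # insert a new row
])
```

Even though only two rows change, every row in the touched partition is read and written again, while untouched partitions are left alone. That cost is what makes frequent updates awkward in this model.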

Likelihood to Recommend

Well suited to processing large datasets (especially joins of two large datasets, cross joins, etc.). Hive is not well suited to generic queries on one table, where it can still be very slow. There are better solutions for that (Presto, Impala).

February 14, 2017

Easy access to data in Hadoop

Verified User

Analyst in Marketing

Internet Company, 501-1000 employees

Score 8 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

Apache Hive is primarily used by data analysts and data engineers at our company. We store most of our data in Hadoop and Apache Hive allows us to access the data faster than by writing MapReduce jobs.

Pros and Cons

Faster than writing MapReduce or Scalding jobs to access data in Hadoop.
Syntax is essentially the same as that of SQL, making the barriers for entry to start using data low.

Apache Hive can be quite slow and is not suitable for interactive querying. Simple queries will take many minutes and more complex queries can take a very long time to finish running.

Likelihood to Recommend

Apache Hive is suitable for allowing easy access to data stored in Hadoop via a familiar SQL syntax. It is more suitable for one-off data pulls and less suitable for interactive querying due to its speed. For a better interactive querying experience, a solution like Presto would be more suitable.

September 16, 2016

Hive, last generation tooling but revolutionary for its time.

Verified User

Consultant in Sales

Computer Software Company, 201-500 employees

Score 5 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

Hive was once a part of our platform, but it never lived up to the promise of performant SQL on HDFS and thus was only truly useful for the users who didn't have the expertise or time to write MapReduce. With the advent of Spark, Hive's days are numbered, and I would not invest in learning it specifically but would instead use SparkSQL, which has some of the better parts of Hive under the covers along with Spark's better execution engine.

Pros and Cons

Connect BI tools to non-relational data stores
Simplify writing legacy MapReduce

Speed needs to be a lot better
Concurrency is not up to snuff

Likelihood to Recommend

Hive is mostly useful in HDFS environments where legacy BI tools need to access the data. This is ok if there is a low concurrency of users but will fall over with any significant multi-user environment.

September 13, 2016

As sweet as Honey - Apache Hive

Venkata Mallepudi

Data Scientist

M2Catalyst, LLC (Wireless, 11-50 employees)

Score 9 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

Apache Hive is used for data processing and analysis in the company that I am working for. Apache Hive is being used by the IT department and the results it produces are shared across the whole organization. Performing operations on terabytes of data has become easy without worrying much about the complexity involved. Its similarity to SQL-based tools has reduced the difficulty of finding employees with big-data skills.

Pros and Cons

Apache Hive works extremely well with large data sets. Analysis over a large data set (example: 1 PB of data) is made easy with Hive.
User-defined functions give users the flexibility to define frequently used operations as functions.
The string functions available in Hive have been used extensively for analysis.

Joins (especially left joins and right joins) are very complex, space consuming, and time consuming. Improvement in this area would be of great help!
More descriptive errors would help in resolving issues that arise when configuring and running Apache Hive.

Likelihood to Recommend

Apache Hive is well suited in situations where doing aggregations would be very time consuming. Apache Hive returns results faster than many other applications.

Latency that exists when working with small data sets is a situation that needs to be looked at. Apache Hive is less appropriate in that scenario.

May 25, 2016

Hive brings the power of SQL to Hadoop

Tom Thomas

Student Lab Instructor (SLI) for Computer Science II

Rochester Institute of Technology (Higher Education, 1001-5000 employees)

Score 9 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

I have used Hive at an enterprise company where I interned. It was being used by the IT department to improve analysis of large datasets stored in the company's Hadoop HDFS. It was also being used because of its support for HiveQL, which is a SQL-like language enabling queries on large datasets. It also reduced the learning curve for handling big data because of HiveQL's similarity to SQL.

Pros and Cons

Supports SQL-like queries
Various storage types including RCFile, HBase, ORC, etc.
Supports indexing for acceleration

HiveQL does not have all the features of SQL
No support for transactions

Likelihood to Recommend

Hive is very well suited for large enterprise businesses that rely on Hadoop for efficient processing of big data in a distributed cluster. HiveQL also brings familiarity of SQL which speeds up the learning process for new users. However, Hive is not an ideal option for a business where data is frequently changing and dynamic.

April 20, 2016

HiveQL, Almost SQL, but not quite.

Verified User

Engineer in Information Technology

Internet Company, 201-500 employees

Score 8 out of 10

Vetted Review

Verified User

Incentivized

Use Cases and Deployment Scope

Hive is being used to put an SQL interface to our Hadoop cluster. This works well because most of our organization is very SQL friendly, so when introducing a new technology, such as Hadoop, the technical users are easily able to adapt to the new technology with no problem.

Pros and Cons

Run SQL queries against a Hadoop cluster.
Many different consoles can use it.
Users don't have to write MapReduce.

Hive needs more SQL support.
Enabling more date functions.
Enabling more SQL table functions, such as inserting into a temp table.

Likelihood to Recommend

Apache Hive is well suited for pulling data for reporting environments or for ad-hoc querying and analysis against a Hadoop cluster. I believe Apache Hive is not well suited for running large big data jobs that need fast performance. It is best utilized for scheduled jobs where fast performance is not required. However, this can greatly depend on how the Hadoop cluster is set up.

December 22, 2014

Hive, a very powerful open source data warehouse solution.

Yinghua Hu

Senior Data Scientist

CARD.com (Financial Services, 11-50 employees)

Score 10 out of 10

Vetted Review

Verified User

Use Cases and Deployment Scope

Hive is used by the data team to store the largest datasets of the company. Data is partitioned in Hive and can be queried by Impala.

Pros and Cons

Partitioning to increase query efficiency.
SerDe to support different data storage formats.
Integrates well with Impala, so data can be queried by Impala.
Support for the Parquet format
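How partitioning increases query efficiency can be sketched with a small model of partition pruning. Hive lays out partitions as key=value directories, so a filter on the partition column lets whole directories be skipped. The paths, the `dt` column, and the helper names below are hypothetical.

```python
# Hypothetical partition directories, following Hive's key=value naming.
partition_dirs = [
    "/warehouse/events/dt=2024-01-01",
    "/warehouse/events/dt=2024-01-02",
    "/warehouse/events/dt=2024-01-03",
]


def partition_value(path, key):
    """Extract the value of `key` from a key=value path segment."""
    for segment in path.split("/"):
        name, sep, value = segment.partition("=")
        if sep and name == key:
            return value
    raise KeyError(key)


def prune(paths, key, predicate):
    """Keep only the partitions whose key value satisfies the predicate."""
    return [p for p in paths if predicate(partition_value(p, key))]


# Similar in spirit to WHERE dt >= '2024-01-02': the first directory is never read.
to_scan = prune(partition_dirs, "dt", lambda dt: dt >= "2024-01-02")
```

The predicate is evaluated on directory names alone, so the data files in skipped partitions are never opened.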

Speed is slower compared to Impala since it uses MapReduce

Likelihood to Recommend

Hive is a data warehouse and it does not allow for updates and deletions. If data needs to be updated frequently, it might not be the best storage solution for that purpose.

Oracle Autonomous Data Warehouse

Oracle Exadata

SAP BW

SAP BW/4HANA

Cloudera Enterprise Data Hub

IBM Netezza Performance Server

OpenText Vertica

Cloudera Data Platform

1010data