TrustRadius
Apache Hive


Overview

What is Apache Hive?

Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.

Recent Reviews

TrustRadius Insights

Apache Hive is a versatile software that has been widely used across various departments and organizations for different use cases. It has …

Help your dev team!

8 out of 10
April 12, 2022
Incentivized
We build our data lake and perform queries on large amounts of data. We group data from multiple sources into a common structure, making …

very useful for OLTP

10 out of 10
April 06, 2022
Incentivized
We use Apache to process large data and get the output with less process time. The framework is very much useful for data processing and …

Big Data the SQL way

8 out of 10
September 23, 2020
Incentivized
I am working as a Research Assistant where I have to process tons of data to produce appropriate findings. Our NLP lab used it for all its …

Awards

Products that are considered exceptional by their customers based on a variety of criteria win TrustRadius awards.


Pricing

N/A
Unavailable

Entry-level set up fee?

  • No setup fee

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting/Integration Services


Alternatives Pricing

What is ClicData?

ClicData is a 100% cloud-based business intelligence platform that allows users to connect, process, blend, visualize and share data from a single place. As an automated platform, users are able to rely on the latest version of company data, to ensure users make the right decisions. Hundreds of…

What is retailMetrix?

RetailMetrix is a data analytics platform for retailers with the mission of enabling retailers to get value from their data. RetailMetrix processes and stores sales, labor and customer data using data warehouse technologies. Its dashboards and reports allow teams to find the data that matters to…


Product Demos

Apache Hive Hadoop Ecosystem - Big Data Analytics Tutorial by Mahesh Huddar

YouTube

Connecting Microsoft Power BI to Apache Hive using Simba Hive ODBC driver

YouTube

Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive

YouTube

Product Details

Apache Hive Technical Details

Operating Systems: Unspecified
Mobile Application: No

Frequently Asked Questions


Reviewers rate Usability highest, with a score of 8.5.

The most common users of Apache Hive are from Enterprises (1,001+ employees).

Comparisons


Reviews and Ratings

(97)

Community Insights

TrustRadius Insights are summaries of user sentiment data from TrustRadius reviews and, when necessary, 3rd-party data sources.

Apache Hive is a versatile software that has been widely used across various departments and organizations for different use cases. It has proven to be particularly helpful in handling large datasets, migrating data between different operating systems, synchronizing programs, and fetching and generating product metrics. Users have found value in using Hive for data analytics, engineering, data science, product management, and IT-related tasks such as improving analysis of big datasets stored in Hadoop HDFS.

Furthermore, Apache Hive has simplified the process of filtering and cleaning data using SQL, reducing the learning curve for handling big data. It allows users to run SQL queries against data in Hadoop, enabling efficient analysis of large datasets without the need to learn a new language. Additionally, Hive has been utilized for building reports, analyzing data stored in the Hadoop file system, processing events gathered in HDFS, and converting them into parquet files for fast querying.

Overall, users have praised Apache Hive for its scalability, accessibility, and cost-effectiveness in storing and retrieving analytics data. It has provided an intuitive solution for storing large datasets, querying big sets of data using SQL, aggregating massive datasets into distilled information for data-driven decision making, and creating external and internal tables in Hadoop/BigData projects. With its ability to process both unstructured and structured data efficiently, Hive has become an essential tool for data analysts, engineers, and business analysts across organizations.

Attribute Ratings

Reviews

(1-25 of 32)
Score 8 out of 10
Vetted Review
Verified User
Apache Hive was built by Facebook to query huge, distributed datasets. Unlike Apache Hive, Apache Spark is an in-memory computation engine, which is why it is significantly quicker than Apache Hive at querying large amounts of data. In contrast to Apache HBase, Apache Hive is better suited for dealing with structured data stored on HDFS.
April 12, 2022

Help your dev team!

Score 8 out of 10
Vetted Review
Verified User
Incentivized
Community support and ease of use, though not ease of deployment.

It enables querying and analyzing large amounts of data stored in HDFS, on the petabyte scale. It has a query language called HQL that transforms SQL queries into MapReduce jobs that run on Hadoop, and it is wonderful for dev teams that love it.
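To make the remark about SQL queries becoming MapReduce jobs more concrete, here is a minimal, purely illustrative Python sketch of how a grouped aggregate such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be evaluated as a map phase, a shuffle, and a reduce phase. The table, the column names, and every helper function are hypothetical, and this is not Hive's actual planner or execution code.

```python
from collections import defaultdict

# Hypothetical rows standing in for a table stored on HDFS.
EMPLOYEES = [
    {"name": "ana", "dept": "eng"},
    {"name": "bo", "dept": "ops"},
    {"name": "cy", "dept": "eng"},
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["dept"], 1


def shuffle(pairs):
    """Group emitted values by key, as happens between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which corresponds to COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Run the three stages in order over the given rows."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(EMPLOYEES))
```

On a real cluster the map and reduce stages would run in parallel across many machines; the single-process version here only shows the shape of the computation.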
Score 9 out of 10
Vetted Review
Verified User
Incentivized
Apache Spark is similar in the sense that it too can be used to query and process large amounts of data through its DataFrame interface. Hive is better for short-term querying while Spark is better for persistent and long-term analysis. Another product is Impala. For our purposes, Impala and Hive were similar, but in general, Impala is better for real-time analysis.
Camilo Palacios | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
Incentivized
We have used simple but necessary functions such as merging data tables that come from different areas but complement each other. If you need to validate the origin of your information and the impact it has, you can also use metadata to do so.
Omkar Marne | TrustRadius Reviewer
Score 6 out of 10
Vetted Review
Verified User
Incentivized
Apache Hive is built on top of the Hadoop file system, so it gives its best when integrated with Hadoop. Data analysis and query optimization become very easy when it is used with Hadoop to perform extract, transform, load (ETL) operations. As Hadoop is a big data system and handles large data sizes, Apache Hive can query large data with less time complexity.
Pablo Gonzalez | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
Incentivized
We have used the system to migrate data, either for new versions or because we are moving to another operating program. The software helps us synchronize programs between different operating systems, keep a consistent history of information, and send already-transformed information to third parties.
Score 8 out of 10
Vetted Review
Verified User
Incentivized
Queries are easy to write and the interface is similar to SQL, so learning overhead is reduced. Multi-user and data type support is provided. It can be easily scaled for very large amounts of analytics. It is very flexible in terms of file formats.
akshay kashyap | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
Apache Hive is a query language developed by Facebook to query large distributed datasets. Apache Hive is a query engine that runs on top of HDFS, so it utilizes the resources of the Hadoop HDFS setup, while Apache Spark is an in-memory compute engine, and that's why [it is] much faster than Apache Hive. Apache HBase mostly deals with unstructured data, while Apache Hive is suited to structured data stored on HDFS.
Manjeet Singh | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
Besides Hive, I have used Google BigQuery, which is costly but has very high computation speed.
Amazon Redshift is another product I used in my most recent organisation.
Both Redshift and BigQuery are managed solutions, whereas Hive needs to be managed.
September 23, 2020

Big Data the SQL way

Score 8 out of 10
Vetted Review
Verified User
Incentivized
Hive and Spark are both Apache Software Foundation projects, hence they share a lot of common features. Hive follows SQL syntax, while Spark supports the RDD and DataFrame APIs. The DataFrame API supports SQL syntax and also has custom functions that perform the same functionality. Spark is faster and can run on distributed systems, while Hive is slower.
Ananth Gouri | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
One of the major advantages of Presto (Teradata), and the main reason people use it, is that it can support multiple data sources, which Apache Hive lacks. Still, most people who come from a structured-data background, whether the old days of dBase or later SQL databases like MS SQL, MySQL, and PostgreSQL, may opt for Apache Hive for the ease and functionality of HiveQL.
Nicolas Hubert | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
Easy to understand, well supported by the community, good documentation. However, it is possible that SAP Business Warehouse could be a good fit, too, maybe even better. I did not have the chance to try it though. We selected Apache Hive because it was far less expensive and handled all the tasks we wanted to perform with it.
Jordan Moore | TrustRadius Reviewer
Score 7 out of 10
Vetted Review
Verified User
Incentivized
Hive was one of the first SQL on Hadoop technologies, and it comes bundled with the main Hadoop distributions of HDP and CDH. Since its release, it has gained good improvements, but selecting the right SQL on Hadoop technology requires a good understanding of the strengths and weaknesses of the alternative options.
Tejaswar Rao | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
  • Faster response time, and it can also handle complex analytical queries
  • Can write custom functions using Python and Hive
  • Able to connect using Hadoop components and also using R
  • Can handle different data formats
  • Can use Structured Query Language to access the data
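One common way to combine Python with Hive is the `TRANSFORM ... USING` clause, which streams rows to an external script as tab-separated lines on standard input and reads tab-separated lines back from standard output. The sketch below assumes a hypothetical two-column input (a name and an amount in cents) and a hypothetical script file; it illustrates the streaming contract and is not the reviewer's code.

```python
import sys


def transform_line(line):
    """Upper-case the name and convert the amount from cents to whole units."""
    name, amount_cents = line.rstrip("\n").split("\t")
    return f"{name.upper()}\t{int(amount_cents) // 100}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive passes one row per line, with columns separated by tabs.
    for line in stdin:
        if line.strip():  # skip blank lines defensively
            stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Saved as, say, `clean_rows.py` (a hypothetical name), the script could be registered with `ADD FILE clean_rows.py;` and invoked roughly as `SELECT TRANSFORM (name, amount_cents) USING 'python3 clean_rows.py' AS (name, amount) FROM payments;`, where the `payments` table and its columns are likewise assumptions.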
Bharadwaj (Brad) Chivukula | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized

It is suited to storing bulk amounts of data in a tabular manner where there is no need for a primary key, or where receiving redundant data will not cause a problem. Even for small amounts of data, it still runs MapReduce (MR), so beware. If your intention is to use it for transactional records, then do not go with it. Also explore other tools like Spark, as many of the features that Hive offers are now supported by Spark.


September 13, 2017

Apache Hive Review

Sameer Gupta | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
Incentivized
I wasn't part of the evaluation process for Apache Hive. This was already implemented when I joined the company. I have worked with other big data platforms and I personally think most of them are quite comparable to one another. It really depends on what the company is going for. For example, Google Cloud makes a ton of sense for a user if they developed their application on Google App Engine.