Apache Hive

Score 8.2 out of 10
Overview

Recent Reviews

Help your dev team!

8 out of 10
April 12, 2022
We build our data lake and perform queries on large amounts of data. We group data from multiple sources into a common structure, making …

Capabilities of Apache Hive

8 out of 10
April 07, 2022
Main purpose for using Apache Hive was to get the insights from data. Analyzing the data and use it to take informed business decisions. …

very useful for OLTP

10 out of 10
April 06, 2022
We use Apache to process large data and get the output with less process time. The framework is very much useful for data processing and …

Big Data the SQL way

8 out of 10
September 23, 2020
I am working as a Research Assistant where I have to process tons of data to produce appropriate findings. Our NLP lab used it for all its …

Pricing

N/A
Unavailable

What is Apache Hive?

Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.

Entry-level set up fee?

  • No setup fee

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting / Integration Services

Alternatives Pricing

What is Oracle Exadata?

Oracle Exadata is software and hardware engineered to support high-performance running of Oracle databases.

What is Cloudera Data Platform?

Cloudera Data Platform (CDP), launched September 2019, is designed to combine the best of Hortonworks and Cloudera technologies to deliver an enterprise data cloud. CDP includes the Cloudera Data Warehouse and machine learning services as well as a Data Hub service for building custom business…

Features Scorecard

No scorecards have been submitted for this product yet.

Product Details

Apache Hive Technical Details

Operating Systems: Unspecified
Mobile Application: No

Frequently Asked Questions

What is Apache Hive's best feature?

Reviewers rate Usability highest, with a score of 8.7.

Who uses Apache Hive?

The most common users of Apache Hive are from Enterprises (1,001+ employees) and the Computer Software industry.

Reviews and Ratings

 (100)

Score 9 out of 10
Vetted Review
Verified User
Review Source
Apache Hive can query data warehouses that update and append records in batches or in real time. Query output can be used in Tableau and other reporting tools, or directly from Python. Structured data and tables can be accessed using SQL-like syntax. With Hive, you can build tables at various levels of the data lake. It is not the best fit for transactional workloads.
April 12, 2022

Help your dev team!

Score 8 out of 10
It is great for lab environments and for starting to work with unstructured data when we are not yet sure how we want to treat it. It also lets queries be improved very quickly, because developers can work with SQL instead of MapReduce. One area for improvement: in production environments, troubleshooting is complicated and requires expert personnel.
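To illustrate the point about working with SQL instead of MapReduce, here is a minimal Python sketch. It computes the same per-source totals twice: once with hand-written map, shuffle, and reduce steps, and once with a single declarative query. The sample records, the `events` table, and the function names are hypothetical, and SQLite is used only as a local stand-in so the query can run without a cluster; it is not Hive itself.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records: (source, amount). Not taken from the review.
EVENTS = [("web", 3), ("app", 5), ("web", 4), ("batch", 1), ("app", 2)]


def totals_map_reduce(events):
    """Sum amounts per source with explicit map, shuffle, and reduce steps."""
    # Map: emit one (key, value) pair per record.
    mapped = [(source, amount) for source, amount in events]
    # Shuffle: group the emitted values by key.
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    # Reduce: collapse each group to a single total.
    return {key: sum(values) for key, values in grouped.items()}


def totals_sql(events):
    """Same result from one GROUP BY query (SQLite as a local stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (source TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)
    rows = conn.execute(
        "SELECT source, SUM(amount) FROM events GROUP BY source"
    ).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    # Both approaches should agree on the totals.
    print(totals_map_reduce(EVENTS) == totals_sql(EVENTS))
```

The query version states what result is wanted and leaves the grouping mechanics to the engine, which is roughly the convenience the reviewer describes.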
Score 9 out of 10
Apache Hive is well-suited for querying Hadoop. If you use Hadoop, you should consider Hive. It is well-suited for large organizations where there is lots of data that needs to be queried. However, there is significant overhead in setting up and maintaining Hive (and Hadoop in general). Small companies and individuals should consider other means of storing data, such as a SQL database.
Camilo Palacios | TrustRadius Reviewer
Score 8 out of 10
Software work execution is on a large scale, and it is good to use for new projects or organizational changes. Data lineage mapping has always been dubious, but this one has had good results. You can store and synchronize data from different departments; the storage process can be manual, but it is best automated.
Omkar Marne | TrustRadius Reviewer
Score 6 out of 10
Apache Hive is best for ETL (Extract, Transform, Load) purposes. It gives its best performance when integrated with the Hadoop Distributed File System. It's also very good for performing mathematical operations and when the data is organized and structured. It can handle large volumes of data (petabytes) but requires a lot of memory in the system. It supports both unstructured and structured data but works best with structured data.
Pablo Gonzalez | TrustRadius Reviewer
Score 8 out of 10
In addition to the information being quickly accessible through the established security protocols, it has helped us as users maintain a fairly comfortable data processing flow. It is more profitable to process the data in batches, and we have been able to unify data from different sources.
Score 8 out of 10
If your workforce knows SQL and you need to explore large-scale data and get insights from it, then Apache Hive is perfect for you. If you have experienced people who have worked on big data before, then using Splunk is better. For starting the journey into data-driven decisions and data analytics, it is better to use Apache Hive first.
Score 9 out of 10
Apache Hive is a data warehouse/ ETL solution that is being used for processing big data for analytics and visualizations. Apache Hive has great architecture that makes it very well suited for organizations.
The Metastore stores metadata for each table and its schema. The Driver operates as a controller for the execution of statements. Other components, such as the Optimizer, the CLI, and the Thrift Server, enable the processing of big data transformations.

akshay kashyap | TrustRadius Reviewer
Score 9 out of 10
You can use Apache Hive to query a large data warehouse that updates and appends records either in batches or in real time. Hive queries can give you output in the desired format, which you can use in any reporting tool such as Tableau or directly from Python.
Score 9 out of 10
Apache Hive fits perfectly if scalability, performance and fault-tolerance are essential for your data warehousing needs. If you are required to process batch jobs Apache Hive will keep your customers happy. On the other hand, if you are working with web logs data and append-only flat-file type of data, then there are better solutions on the market.
September 23, 2020

Big Data the SQL way

Score 8 out of 10
Apache Hive is very well suited for those who are very familiar with SQL query syntax. Thanks to its easy-to-use syntax, it can really help in scenarios where a conventional database cannot be used to analyze big datasets.

On the other hand, it's definitely slower than some alternatives such as Spark. Also, it's not recommended for processing small datasets. Pandas and other ordinary data loading libraries can be useful for dealing with small datasets.
Kristjan Gannon | TrustRadius Reviewer
Score 7 out of 10
Apache Hive is useful for regularly reporting and analyzing data. In terms of ad-hoc analysis and debugging, the cycles can be quite long for querying, feedback, debugging queries, etc.
Score 8 out of 10
Hive is suitable for big data analysis tasks on top of historical data storage but is not quite suitable for real-time data (if that is the case, Cassandra should be considered). As it is not real SQL, it is very good for read-only operations and on-the-fly aggregation; however, if data modification and transactions are needed, it is not suitable.
Ananth Gouri | TrustRadius Reviewer
Score 9 out of 10
I would definitely recommend Apache Hive if sought by a colleague. Especially for people who are working at academic institutions, they can demonstrate programs like word count, tab count, space count, new lines count, and other related programs - with a basic setup of a HiveQL.

The only underlying problem could be that the Apache Hive is designed to run on the Apache Hadoop ecosystem. People who are not comfortable using a Linux tree structure based File System or even people who are not likely to use a Linux OS might not like to use Hive.
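As a rough illustration of the word count demonstration mentioned in this review, the sketch below pairs an illustrative HiveQL query with a plain Python reference that shows what the query would be expected to compute. The table name `docs`, the column name `line`, and the Python function name are hypothetical; `split`, `explode`, and `LATERAL VIEW` are Hive built-ins, but the query is kept as a string here and is not executed against a Hive installation.

```python
from collections import Counter

# Illustrative HiveQL only; the table `docs` and column `line` are hypothetical.
# split() breaks each line into an array of words, and LATERAL VIEW explode()
# turns that array into one row per word so the rows can be grouped and counted.
WORD_COUNT_HIVEQL = """
SELECT word, COUNT(*) AS occurrences
FROM docs
LATERAL VIEW explode(split(line, ' ')) exploded AS word
GROUP BY word
"""


def word_count(lines):
    """Plain Python reference: count words split on single spaces."""
    counts = Counter()
    for line in lines:
        counts.update(line.split(" "))
    return dict(counts)


if __name__ == "__main__":
    print(WORD_COUNT_HIVEQL.strip())
    print(word_count(["big data", "big queries"]))
```

The same pattern could be adapted for the other counting exercises the reviewer lists by changing the delimiter passed to `split`.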
Nicolas Hubert | TrustRadius Reviewer
Score 9 out of 10
Apache Hive acts as a hub where information is stored and can be smoothly read and analyzed by BI analysts in order to make wise, data-driven decisions. Users can read, write, and manage data, too. This only requires some intermediate SQL knowledge, and we all know learning SQL is quite easy. I cannot think of any scenario where Apache Hive would not be appropriate.
Score 7 out of 10
Apache Hive is well suited for organizations looking for an initial tool to begin their process of managing their data warehouse as it is open-source and relatively easy to set up. This works well with some legacy systems and many consoles support this. While Hive used to be quite revolutionary, it has fallen behind many other tools that are more performant or specialized for managing DBs, writing queries, and partitioning tables.
Score 9 out of 10
This is best suited for data analysts and scientists; it's not a programmer's tool. You may still need an RDBMS to read data from, as updates and deletes can get a bit more complicated. You can run batch jobs, but this will have to be facilitated by additional tools.
It's good for fast query processing and for storing large amounts of data.
Tejaswar Rao | TrustRadius Reviewer
Score 9 out of 10
  1. To query on large sets of data
  2. Faster access compared to traditional Databases
  3. OLAP projects
  4. Data Warehousing project
  5. To get insights from gigabytes or terabytes of data
  6. Rule-based projects and also to identify patterns in data
  7. For applying transformations on large sets of data
  8. Faster response time than traditional databases
  9. Also able to connect with Hadoop components
  10. For complex analytical and different types of data formats