Item: Apache Hive
Rating: 8
Author: Verified User

Use Cases and Deployment Scope

On-premises large data processing is handled by Apache Hive, which is running on Cloud ERA Servers. In order to use Apache Hive, you must have a distributed system that is query efficient and can perform queries quicker with parallel execution. Metrics like user information and purchase history are stored in HDFS and then accessed using queries built on top of Hive using Apache Hive.

Pros and Cons

Reduce-based query language with a simple query language.
Parallelism across a distributed system is provided.
All cloud platforms have access to a tabular format and interfaces.

Due to the shuffled data, complex joins may take a long time to complete.
Execution is dependent on external storage and memory.

Most Important Features

Maintaining inquiries on the premise is simple.
Hive may be rapidly learned by anyone who is already familiar with SQL.
Parallel and distributed processing.

Return on Investment

Improved performance compared to a database management system.
HDFS-based distributed processing greatly improves scalability.
Improved query performance compared to oracle databases.

Alternatives Considered

Apache Spark

To query a huge, distributed dataset, Apache Hive was built by Facebook. Unlike Apache Hive, Apache Spark is an in-memory computation engine, which is why it is significantly quicker than Apache Hive at querying large amounts of data. In contrast to Apache HBase, Apache Hive is better suited for dealing with structured data stored on HDFS.

Key Insights

Do you think Apache Hive delivers good value for the price?

Yes

Are you happy with Apache Hive's feature set?

Yes

Did Apache Hive live up to sales and marketing promises?

Yes

Did implementation of Apache Hive go as expected?

Yes

Would you buy Apache Hive again?

Yes

Other Software Used

Apache Hadoop, Salesforce Marketing Cloud Interaction Studio (formerly Evergage + MyBuys)

Likelihood to Recommend

Data warehouses that update and append records in batches or real time can be queried using Apache Hive. Tableau and other reporting tools may be used straight from Python searches on Apache data sets. Structured data and tables may be accessed using SQL-like syntax. Using a hive, you may build tables at various levels of the Data Lake. Transactional databases are not the best fit.

With Apache Hive, you can enter the world of Big Data

Overall Satisfaction with Apache Hive