Hive, a very powerful open source data warehouse solution.
December 22, 2014

Yinghua Hu | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User

Software Version

14

Overall Satisfaction with Hive

Hive is used by our data team to store the company's largest datasets. The data is partitioned in Hive and can also be queried by Impala.
  • Partitioning increases query efficiency.
  • SerDes support different data storage formats.
  • Integrates well with Impala, which can query the same data.
  • Supports the Parquet columnar storage format.
  • Slower than Impala, since it runs queries as MapReduce jobs.
  • Hive combined with Impala makes our analysts more efficient at querying the data.
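To illustrate the partitioning and Parquet points in the list above, here is a minimal Python sketch that only builds HiveQL strings. The table name `page_views`, its columns, and the helper functions are hypothetical and not taken from our setup; the `PARTITIONED BY` and `STORED AS PARQUET` clauses are standard HiveQL.

```python
def create_partitioned_table(table, columns, partition_columns):
    """Build a HiveQL CREATE TABLE statement for a partitioned Parquet table.

    `columns` and `partition_columns` are lists of (name, hive_type) pairs.
    All names passed in here are illustrative assumptions.
    """
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    parts = ", ".join(f"{name} {hive_type}" for name, hive_type in partition_columns)
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({parts})\n"
        "STORED AS PARQUET"
    )


def query_partition(table, partition_column, value):
    """Build a query that filters on the partition column, so that only
    the matching partition directory needs to be scanned."""
    return f"SELECT * FROM {table} WHERE {partition_column} = '{value}'"


ddl = create_partitioned_table(
    "page_views",
    [("user_id", "BIGINT"), ("url", "STRING")],
    [("view_date", "STRING")],
)
query = query_partition("page_views", "view_date", "2014-12-01")

print(ddl)
print(query)
```

Filtering on the partition column is what gives the efficiency gain: each partition is stored in its own directory on HDFS, so a query restricted to one `view_date` does not have to read the others.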
Impala queries the same data faster than Hive, but it depends heavily on Hive. Impala also does not support SerDes, which allow querying different data formats (JSON, XML); Hive does.
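As a rough sketch of what a SerDe looks like in practice, the Python snippet below builds the HiveQL for an external table over JSON files. The table name, columns, and HDFS location are hypothetical; the SerDe class `org.apache.hive.hcatalog.data.JsonSerDe` ships with Hive's HCatalog module, and this assumes its jar is available to the Hive session.

```python
# JSON SerDe class from Hive's HCatalog module (assumes the
# hive-hcatalog-core jar is on the Hive classpath).
JSON_SERDE = "org.apache.hive.hcatalog.data.JsonSerDe"


def create_json_table(table, columns, location):
    """Build a HiveQL statement for an external table backed by JSON files.

    `columns` is a list of (name, hive_type) pairs; the table name and
    location used in the example call below are illustrative only.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"ROW FORMAT SERDE '{JSON_SERDE}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


json_ddl = create_json_table(
    "raw_events",
    [("event_id", "STRING"), ("payload", "STRING")],
    "/data/raw_events",
)

print(json_ddl)
```

The SerDe tells Hive how to turn each line of the underlying files into table columns, which is why the same SQL interface can sit on top of very different storage formats.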
I do not know of another data warehouse solution that integrates with HDFS as well as Hive does.
Hive is a data warehouse and is not designed for row-level updates and deletions. If data needs to be updated frequently, Hive might not be the best storage solution for that purpose.