Apache Hive Review
September 13, 2017


Sameer Gupta | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User

Overall Satisfaction with Apache Hive

Hive is currently used across the entire analytics organization at SurveyMonkey. The business problem we solve with it is storing and accessing large data sets (typically logs) in a scalable, accessible place.
  • SQL-like query engine, which allows easy ramp-up from a standard RDBMS
  • Scalability is great
  • If properly configured, data retrieval is fantastic
  • The way we currently have it implemented is quite slow, but I believe that's more a result of our implementation than of Hive itself
  • Joins tend to be slow
  • I think productivity has increased for us, as we're now able to store data going as far back as we want
  • Allows us to perform analytics that we wouldn't be able to do otherwise. For example, customer life cycle mapping is possible through this
  • ROI in terms of ramp-up time for new employees who don't have a big data background. Since HQL is available and is much like SQL, analysts who have little to no big data exposure can quickly get up to speed and start working
I wasn't part of the evaluation process for Apache Hive. It was already implemented when I joined the company. I have worked with other big data platforms, and I personally think most of them are quite comparable to one another. It really depends on what the company is going for. For example, Google Cloud makes a ton of sense for a user who developed their application on Google App Engine.
I think Apache Hive is great for a company just stepping into the big data realm. I think the fact that it's open source allows for a variety of tools to be integrated. The fact that it has HiveQL makes for a great transition from a standard RDBMS to a big data tool. This can be very nice in terms of cost savings, as the ramp-up time for an analyst will be quite low.
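
As a rough illustration of the ramp-up point, here is a minimal sketch of the kind of log query an analyst might write. The `logs` table and its `user_id` and `event_date` columns are hypothetical, not our actual schema. The query text uses only constructs shared by HiveQL and standard SQL; to keep the sketch self-contained it is run here against an in-memory SQLite database through Python's `sqlite3` module rather than against a Hive cluster.

```python
import sqlite3

# Hypothetical log query: event counts per user since the start of 2017.
# The table and column names are illustrative assumptions.
QUERY = """
SELECT user_id, COUNT(*) AS events
FROM logs
WHERE event_date >= '2017-01-01'
GROUP BY user_id
ORDER BY events DESC
LIMIT 10
"""


def top_users(rows):
    """Load (user_id, event_date) rows into a throwaway table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (user_id TEXT, event_date TEXT)")
    conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample = [
    ("a", "2017-02-01"),
    ("a", "2017-03-01"),
    ("b", "2017-02-15"),
    ("c", "2016-12-31"),  # filtered out by the date predicate
]
print(top_users(sample))
```

An analyst who already knows SQL should be able to read this query without any Hive-specific training, which is where the short ramp-up time comes from.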