As sweet as Honey - Apache Hive
Overall Satisfaction with Apache Hive
Apache Hive is used for data processing and analysis at the company I work for. The IT department runs it, and the results it produces are shared across the whole organization. Performing operations on terabytes of data has become easy, without worrying much about the complexity involved. Its similarity to SQL-based tools has reduced the difficulty of finding employees with big-data skills.
Pros
- Apache Hive works extremely well with large data sets. Analysis over a large data set (for example, 1 PB of data) is made easy with Hive.
- User-defined functions give users the flexibility to define frequently used operations as functions.
- The string functions available in Hive have been used extensively for analysis.
Cons
- Joins (especially left joins and right joins) are very complex, space-consuming, and time-consuming. Improvement in this area would be of great help!
- More descriptive error messages would help in resolving issues that arise when configuring and running Apache Hive.
- Apache Hive has had a positive impact on our overall business objectives. Sharing data has never been this easy when dealing with large data sets.
- Impala
I used Impala when it was still in its early stages. Apache Hive has been very convenient from the very beginning.