Best query platform for ETL.
April 08, 2022

Omkar Marne | TrustRadius Reviewer
Score 6 out of 10
Vetted Review
Verified User

Overall Satisfaction with Apache Hive

I used Apache Hive on top of Hadoop to filter and clean data using SQL, as part of a project I was working on. Apache Hive provides a SQL-like platform on which we can run SQL queries. It was a natural choice for cleaning data because we were already using Apache Hadoop, and both are Apache products.
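As a rough illustration of the kind of filter-and-clean query described above, here is a minimal sketch. It is an assumption-laden stand-in, not the reviewer's actual setup: it runs the query against Python's built-in sqlite3 module instead of a Hive cluster, and the table name `raw_users`, its columns, and the sample rows are all hypothetical. The query itself sticks to constructs that HiveQL also supports (TRIM, IS NOT NULL, BETWEEN, ORDER BY).

```python
import sqlite3

# Hypothetical raw rows: (id, name, email, age). Illustrative only.
ROWS = [
    (1, "  Alice ", "alice@example.com", 34),
    (2, "Bob", None, 41),                      # missing email, filtered out
    (3, "  carol", "carol@example.com", -5),   # implausible age, filtered out
    (4, "Dave", "dave@example.com", 29),
]

# Filtering (WHERE) and cleaning (TRIM) in one SQL statement.
# Run here on SQLite as a local stand-in for Hive.
CLEAN_SQL = """
SELECT id, TRIM(name) AS name, email, age
FROM raw_users
WHERE email IS NOT NULL
  AND age BETWEEN 0 AND 120
ORDER BY id
"""


def clean_rows(rows):
    """Load rows into an in-memory table and return the cleaned subset."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE raw_users "
            "(id INTEGER, name TEXT, email TEXT, age INTEGER)"
        )
        conn.executemany("INSERT INTO raw_users VALUES (?, ?, ?, ?)", rows)
        return conn.execute(CLEAN_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in clean_rows(ROWS):
        print(row)
```

On a real Hive deployment the same SELECT would typically read from a table stored on HDFS, and the result would usually be written to another table rather than printed.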
  • Filtering data
  • Cleaning data
  • SQL-like interface
  • Integrates with Hadoop
  • Uses a lot of memory
  • Not compatible with other databases such as PostgreSQL and MySQL
  • Limited support
  • Slow compared to other interfaces
  • Integrates with Hadoop
  • Large size data analysis
  • Query optimization
  • Fast results
  • Reduced time complexity
  • Code debugging is easy
Apache Hive is built on top of the Hadoop file system, so it performs best when integrated with Hadoop. Data analysis and query optimization become much easier when it is used with Hadoop to perform extract, transform, load (ETL) operations. Because Hadoop is a big data system that handles large data sizes, Apache Hive can query large datasets in less time.

Do you think Apache Hive delivers good value for the price?

Yes

Are you happy with Apache Hive's feature set?

Yes

Did Apache Hive live up to sales and marketing promises?

Yes

Did implementation of Apache Hive go as expected?

Yes

Would you buy Apache Hive again?

Yes

Apache Hive is best suited for ETL (extract, transform, load) purposes. It performs best when integrated with the Hadoop Distributed File System (HDFS). It is also very good for performing mathematical operations, especially when the data is organized and structured. It can handle very large data volumes (petabytes) but requires a lot of memory on the system. It supports both unstructured and structured data, but works best with structured data.