Apache Hive vs. Apache Kafka

Apache Hive

95 Reviews and Ratings

Apache Kafka

152 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Hive	Score 8.0 out of 10	N/A	Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.	N/A
Apache Kafka	Score 8.8 out of 10	N/A	Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.	N/A

Pricing

	Apache Hive	Apache Kafka
Editions & Modules	No answers on this topic	No answers on this topic
Free Trial	No	No
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	N/A	N/A

Community Pulse
	Apache Hive	Apache Kafka
Considered Both Products	(1) Verified User, Engineer (chose Apache Hive; incentivized review): Apache Spark is similar in the sense that it too can be used to query and process large amounts of data through its DataFrame interface. Hive is better for short-term querying while Spark is better for persistent and long-term analysis. Another product is Impala. For our … (2) Manjeet Singh, Senior Manager - Engineering (chose Apache Hive; incentivized review): Besides Hive, I have used Google BigQuery, which is costly but has very high computation speed. Amazon Redshift is another product I used in my recent organisation. Both Redshift and BigQuery are managed solutions, whereas Hive needs to be managed. (3) Verified User, Engineer (chose Apache Hive; incentivized review): I have used Storm for real-time processing, but that only addresses a few data points. For broader access to data, Hive is well suited.	No answer on this topic

Best Alternatives
	Apache Hive	Apache Kafka
Small Businesses	Google BigQuery Score 8.7 out of 10	No answers on this topic
Medium-sized Companies	Cloudera Enterprise Data Hub Score 9.0 out of 10	IBM MQ Score 9.0 out of 10
Enterprises	Oracle Exadata Score 9.8 out of 10	IBM MQ Score 9.0 out of 10

User Ratings
	Apache Hive	Apache Kafka
Likelihood to Recommend	8.0 (35 ratings)	8.0 (19 ratings)
Likelihood to Renew	10.0 (1 rating)	9.0 (2 ratings)
Usability	8.5 (7 ratings)	8.0 (2 ratings)
Support Rating	7.0 (6 ratings)	8.4 (4 ratings)

User Testimonials
	Apache Hive	Apache Kafka
Likelihood to Recommend	Software work execution is on a large scale; it is good to use for new projects or organizational changes. Data lineage mapping has always been dubious, but this one has had good results. You can store and synchronize data from different departments; the storage process can be manual, but it is best automated. (Camilo Palacios, IT Administrator; incentivized review)	Apache Kafka is well suited for most data-streaming use cases. Unless you have a specific use case for a cloud PaaS such as Amazon Kinesis or Azure Event Hubs for your data lakes, Apache Kafka, once set up well, will take care of everything else in the background. Azure Event Hubs is good for cross-cloud use cases. I have no real-world experience with Amazon Kinesis, but I believe it is the same. (Victor Tay, Engineer)
Pros	Apache Hive allows us to write expressive solutions to complex problems thanks to its SQL-like syntax. Relatively easy to set up and start using. Very little ramp-up to start using the actual product; documentation is very thorough, there is an active community, and the code base is constantly being improved. (Verified User, Anonymous; incentivized review)	Really easy to configure. I've used other message brokers such as RabbitMQ, and compared to them, Kafka's configurations are very easy to understand and tweak. Very scalable: easily configured to run on multiple nodes, allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered). Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many). (Verified User, Anonymous; incentivized review)
Cons	Some queries, particularly complex joins, are still quite slow and can take hours. Previous jobs and queries are sometimes not stored. Switching to Impala can sometimes be time-consuming (i.e. the system hangs or is slow to respond). Sometimes directories and tables don't load properly, which causes confusion. (Verified User, Anonymous; incentivized review)	Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great. Simplify the process for local deployment of Kafka and provide a user interface to get visibility into the different topics and the messages being processed. The learning curve around creation of brokers and topics could be simplified. (Animesh Kumar, Senior Member of Technical Staff)
Likelihood to Renew	I do not know of a second data warehouse solution that integrates with HDFS as well as Hive does. (Yinghua Hu, Senior Data Scientist)	Kafka is quickly becoming a core product of the organization; indeed, it is replacing older messaging systems. No better alternatives found yet. (Juan Francisco Tavira, Global Technology Centre - Middleware; incentivized review)
Usability	Hive is a very good big data analysis and ad-hoc query platform, which also supports scaling. BI processes can be easily integrated with Hadoop via Hive. It can deal with much larger data sets than a traditional RDBMS can. It is a "must-have" component of the big data domain. (Verified User, Anonymous; incentivized review)	Apache Kafka is highly recommended for developing loosely coupled, real-time processing applications. Also, Apache Kafka provides property-based configuration: producers, consumers, and brokers each have their own separate properties file. (Jimesh V Shah, Senior Software Engineer; incentivized review)
Support Rating	Apache Hive is a FOSS project and it's open source. We need not comment on the support of open source and its developer community, but it has tremendous developer support and awesome documentation. I would say that much support can be gathered from the community. (Ananth Gouri, Assistant Professor; incentivized review)	Support for Apache Kafka (if you are willing to pay) is available from Confluent, which includes the same team that created Kafka at LinkedIn, so they know this software inside and out. Moreover, Apache Kafka is well known, and best-practice documents and deployment scenarios are easily available for download, for example from eBay, LinkedIn, Uber, and NYTimes. (Verified User, Anonymous; incentivized review)
Alternatives Considered	Besides Hive, I have used Google BigQuery, which is costly but has very high computation speed. Amazon Redshift is another product I used in my recent organisation. Both Redshift and BigQuery are managed solutions, whereas Hive needs to be managed. (Manjeet Singh, Senior Manager - Engineering; incentivized review)	I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer on the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer far fewer functionalities and respond to other needs. (Verified User, Anonymous; incentivized review)
Return on Investment	Apache Hive is a secure and scalable solution that helps increase overall organizational productivity. Apache Hive can handle and process large amounts of data in a timely manner. It simplifies writing SQL queries, which helps the organization because most companies use SQL for all query jobs. (Verified User, Anonymous; incentivized review)	Positive: get a quick and reliable pub/sub model implemented; data flows easily across components. Positive: it's scalable, so we can develop small and scale for real-world scenarios. Negative: it's easy to get into a confusing situation if you are not experienced yet or something strange has happened (rare, but it happens). Troubleshooting such situations can take time and effort. (Borislav Traykov, DevOps Team Leader; incentivized review)
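
The Kafka "Pros" testimonial above attaches a caveat to its scalability claim: parallelism comes easily only when messages do not have to be consumed in the exact order they were delivered. The sketch below is a rough illustration of why, written in plain Python with no Kafka client involved. It assumes messages are spread across partitions by hashing a key, so order holds within one key but not across keys. The names `assign_partition`, `partition_messages`, and `values_for`, the sample keys, and the partition count are all hypothetical, and CRC32 is only a stand-in hash, not what Kafka itself uses.

```python
import zlib
from collections import defaultdict


def assign_partition(key: str, num_partitions: int) -> int:
    """Toy key-based partitioner. CRC32 is an assumption made for this
    sketch; it is not the hash used by Kafka's own partitioner."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_messages(messages, num_partitions):
    """Append each (key, value) message to the partition chosen for its key."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[assign_partition(key, num_partitions)].append((key, value))
    return dict(partitions)


# Hypothetical stream: two keys whose messages are interleaved on arrival.
messages = [
    ("user-a", 1),
    ("user-b", 1),
    ("user-a", 2),
    ("user-b", 2),
    ("user-a", 3),
]
partitions = partition_messages(messages, num_partitions=3)


def values_for(key):
    """Collect the values for one key in the order they sit in its partition."""
    return [v for part in partitions.values() for k, v in part if k == key]
```

In this toy model every message for a given key lands in the same partition, so `values_for("user-a")` returns that key's values in arrival order. Messages with different keys may sit in different partitions and be read by different consumers in parallel, which is why no global ordering across keys can be assumed.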