Apache Hive vs. IBM Db2 Big SQL

Apache Hive

95 Reviews and Ratings

IBM Db2 Big SQL

18 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Hive	Score 8.0 out of 10	N/A	Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.	N/A
Db2 Big SQL	Score 9.0 out of 10	N/A	IBM offers Db2 Big SQL, an enterprise-grade, hybrid, ANSI-compliant SQL-on-Hadoop engine that delivers massively parallel processing (MPP) and advanced data query. Big SQL offers a single database connection or query for disparate sources such as HDFS, RDBMS, NoSQL databases, object stores, and WebHDFS.	N/A

Pricing

	Apache Hive	IBM Db2 Big SQL
Editions & Modules	No answers on this topic	No answers on this topic
Free Trial	No	No
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	N/A	N/A

Community Pulse
Considered Both Products

Apache Hive

Verified User, Anonymous (chose Apache Hive; incentivized review)
Apache Hive was built by Facebook to query huge, distributed datasets. Unlike Apache Hive, Apache Spark is an in-memory computation engine, which is why it is significantly quicker than Apache Hive at querying large amounts of data. In contrast to Apache HBase, Apache Hive is …

Prasanna Kumar TR, Developer and Site Contributor (chose Apache Hive; incentivized review)
Apache Hive was more flexible than MS SQL Server. Elasticsearch was a little complex. Google BigQuery cost more.

Verified User, Anonymous (chose Apache Hive; incentivized review)
Community support and ease of use (though not ease of deployment). It enables querying and analyzing large amounts of data stored in HDFS, on the petabyte scale. It has a query language called HQL that transforms SQL queries into MapReduce jobs that run on Hadoop, and it is wonderful for the …

Verified User, Anonymous (chose Apache Hive; incentivized review)
Apache Spark is similar in the sense that it too can be used to query and process large amounts of data through its DataFrame interface. Hive is better for short-term querying while Spark is better for persistent and long-term analysis. Another product is Impala. For our …

Camilo Palacios, IT Administrator (chose Apache Hive; incentivized review)
We have used simple but necessary functions such as merging data tables that, although they may come from different areas, complement each other or are needed together. You can also use metadata if you need to validate the origin of your information and what impact it has, …

Omkar Marne, Research Application Software Engineer (chose Apache Hive; incentivized review)
Apache Hadoop is built on top of the Hadoop file system, so it gives its best when integrated with Hadoop. Data analysis and query optimization become very easy when used with Hadoop to perform extract, transform, load (ETL) operations. As Hadoop is a big data system and handles large …

Pablo Gonzalez, Internet Marketing Manager (chose Apache Hive; incentivized review)
We have used the system to migrate data, either for new versions or because we will use another operating program. The software helps us synchronize programs between different operating systems, a history of information can be kept constant, it can be sent to third parties …

Verified User, Anonymous (chose Apache Hive; incentivized review)
Queries are easy to write and the interface is similar to SQL, so learning overhead is reduced. Multi-user and data type support is provided. It can be easily scaled for very large amounts of analytics. It is very flexible in terms of file formats.

Verified User, Anonymous (chose Apache Hive; incentivized review)
Snowflake, Splunk Cloud, Talend Open Studio, Azure Data Factory and Apache Spark

Verified User, Anonymous (chose Apache Hive; incentivized review)
Because of its effective query resolution time, its performance, and its user-friendly framework compared to other products.

Surendranatha Reddy Chappidi, Senior Data Engineer (chose Apache Hive; incentivized review)
Azure Synapse Analytics (Azure SQL Data Warehouse) and Databricks Lakehouse Platform (Unified Analytics Platform)

akshay kashyap, Consultant (chose Apache Hive; incentivized review)
Apache Hive is a query language developed by Facebook to query over a large distributed dataset. Apache [Hive] is a query engine that runs on top of HDFS, so it utilizes the resources of the HDFS Hadoop setup, while Apache Spark is an in-memory compute engine, and that's why [it is] much …

Manjeet Singh, Senior Manager - Engineering (chose Apache Hive; incentivized review)
Besides Hive, I have used Google BigQuery, which is costly but has very high computation speed. Amazon Redshift is another product, which I used in my recent organisation. Both Redshift and BigQuery are managed solutions, whereas Hive needs to be managed.

Verified User, Anonymous (chose Apache Hive; incentivized review)
Hive and Spark have the same parent company, hence they share a lot of common features. Hive follows SQL syntax while Spark has support for the RDD and DataFrame APIs. The DataFrame API supports SQL syntax and also has custom functions to perform the same functionality. Spark is faster and …

Verified User, Anonymous (chose Apache Hive; incentivized review)
Apache Hive decouples the query layer from the storage layer, so it is more flexible and expandable.

Ananth Gouri, Assistant Professor (chose Apache Hive; incentivized review)
One of the major advantages of Presto (Teradata), and the main reason people use it, is that it can support multiple data sources, which Apache Hive lacks. But still, most people who come from a structured-data background …

Nicolas Hubert, Machine Learning Engineer (chose Apache Hive; incentivized review)
Easy to understand, well supported by the community, good documentation. However, it is possible that SAP Business Warehouse could be a good fit, too, maybe even better. I did not have the chance to try it though. We selected Apache Hive because it was far less expensive and …

Kartik Chavan, Data Science Trainee (chose Apache Hive; incentivized review)
I considered Hive because it is the best-suited option when it comes to larger data access. Besides, learning HiveQL is comparatively easy.

Verified User, Anonymous (chose Apache Hive; incentivized review)
I have used Storm for real-time processing, but that only addresses a few data points. But for larger access to data, Hive is well suited.

Verified User, Anonymous (chose Apache Hive; incentivized review)
[We selected Apache Hive because] it's from Apache and open source, so it's free.

Tejaswar Rao, Associate Consultant (chose Apache Hive; incentivized review)
Faster response time, and it can handle complex analytical queries. Able to write custom functions using Python and Hive. Able to connect using Hadoop components and also using R.

Bharadwaj (Brad) Chivukula, Sr. Technical Manager/Delivery Manager (chose Apache Hive; incentivized review)
For storing bulk amounts of data in a tabular manner where there's no need for a primary key, or where receiving redundant data will not cause a problem. For small amounts of data, it does run MR, so beware. If your intention is to use it as a …

Sameer Gupta, Senior Data Analyst (chose Apache Hive; incentivized review)
I wasn't part of the evaluation process for Apache Hive. This was already implemented when I joined the company. I have worked with other big data platforms and I personally think most of them are quite comparable to one another. It really depends on what the company is going …

Verified User, Anonymous (chose Apache Hive; incentivized review)
Hive is SQL compliant, which makes it easy for the data folks compared to Pig.

Verified User, Anonymous (chose Apache Hive; incentivized review)
Apache Pig is probably the most direct technology to compare to Hive and has several different use cases to Hive. If you want to simplify processing tasks that run using MapReduce then Apache Pig may be a better tool for the job. However if you are going to be running many …

IBM Db2 Big SQL

Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer (chose Db2 Big SQL; incentivized review)
MS SQL Server was ruled out given we didn't feel we could collapse environments. We thought of MS-SQL as more of a one for one replacement for Sybase ASE, i.e., server for server. SAP HANA was evaluated and given a big thumbs up but was rejected because the SQL would have …

Best Alternatives
	Apache Hive	IBM Db2 Big SQL
Small Businesses	Google BigQuery Score 8.7 out of 10	No answers on this topic
Medium-sized Companies	Cloudera Enterprise Data Hub Score 9.0 out of 10	Cloudera Manager Score 9.9 out of 10
Enterprises	Oracle Exadata Score 9.8 out of 10	IBM Analytics Engine Score 8.6 out of 10

User Ratings
	Apache Hive	IBM Db2 Big SQL
Likelihood to Recommend	8.0 (0 ratings)	9.0 (0 ratings)
Likelihood to Renew	10.0 (0 ratings)	- (0 ratings)
Usability	8.5 (0 ratings)	8.0 (0 ratings)
Support Rating	7.0 (0 ratings)	8.8 (0 ratings)

User Testimonials
	Apache Hive	IBM Db2 Big SQL
Likelihood to Recommend	Apache Hive shines for ad-hoc analysis and plugging into BI tools. Its SQL-like syntax allows for ease of use not only for engineers but also for data analysts. In our experience, there are probably more desirable tools to use if you are planning on integrating Hive into your processing pipeline. (Verified User, Anonymous; incentivized review)	IBM Db2 is a legacy database and is primarily great for supporting certain legacy applications. It's simply not as competitive as many solutions on the market now. (John Spies, Database Administrator; incentivized review)
Pros	Hive syntax is almost like SQL, so for someone already familiar with SQL it takes almost no effort to pick up Hive. It can run MapReduce jobs using JSON parsing and generate dynamic partitions in Parquet file format. It simplifies your experience with Hadoop, especially for non-technical or non-coding partners. (Bharadwaj (Brad) Chivukula, Sr. Technical Manager/Delivery Manager; incentivized review)	Data storage. Data manipulation. Data definitions. Data reliability. (John Spies, Database Administrator; incentivized review)
Cons	Use Hive for analytical workloads and write-once, read-many scenarios; avoid updates and deletes. Behind the scenes, Hive creates MapReduce jobs. Hive performance is slow compared to Apache Spark: MapReduce writes intermediate outputs to disk, whereas Spark operates in memory and uses a DAG. (Verified User, Anonymous; incentivized review)	Cloud readiness. Ease of implementation. (Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer; incentivized review)
Likelihood to Renew	I do not know of a second data warehouse solution that integrates with HDFS as well as Hive. (Yinghua Hu, Senior Data Scientist)	No answers on this topic
Usability	Hive is a very good big data analysis and ad-hoc query platform, which also supports scaling. BI processes can be easily integrated with Hadoop via Hive. It can deal with much larger data sets than a traditional RDBMS can. It is a "must-have" component of the big data domain. (Verified User, Anonymous; incentivized review)	IBM DB2 is a solid service but hasn't seen much innovation over the past decade. It gets the job done and supports our IT operations across digital, so it is fair. (John Spies, Database Administrator; incentivized review)
Support Rating	Apache Hive is a FOSS project and it's open source. We need not comment on the support of open source and its developer community, but it has tremendous developer support and awesome documentation. I would say that much support can be gathered from the community. (Ananth Gouri, Assistant Professor; incentivized review)	IBM did a good job of supporting us during our evaluation and proof of concept. They were able to provide all necessary guidance, answer questions, help us architect it, etc. We were pleased with the support provided by the vendor. I will caveat and say this support was all before the sale; however, we have a ton of IBM products and they provide the same high level of support for all of them. I didn't see this being any different. I give IBM support two thumbs up! (Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer; incentivized review)
Alternatives Considered	We have used simple but necessary functions such as merging data tables that, although they may come from different areas, complement each other or are needed together. Using metadata to validate the origin of your information and what impact it has is also feasible. (Camilo Palacios, IT Administrator; incentivized review)	MS SQL Server was ruled out given we didn't feel we could collapse environments. We thought of MS-SQL as more of a one for one replacement for Sybase ASE, i.e., server for server. SAP HANA was evaluated and given a big thumbs up but was rejected because the SQL would have to be rewritten at the time (now they have an accelerator so you don't have to). Also, there was a very low adoption rate within the enterprise. IBM DB2 Big SQL was not selected even though technically it achieved high scores, because we could not find readily available talent and because of the low adoption rate within the enterprise (basically no adoption at the time). We ended up selecting Exadata because of the high adoption rate within the enterprise even though technically HANA and Big SQL were superior in our evaluations. (Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer; incentivized review)
Return on Investment	Good ROI for being able to access data easily across the network: we have large amounts of data and this is a good system to access it. Good ROI for being easy for new employees to learn, so not much time is spent, which saves costs. Good ROI for being able to integrate with Spark and other applications, so data can be analyzed through programs. (Verified User, Anonymous; incentivized review)	Performance gains were positive. Finding resources on the street with knowledge at the time was hard. (Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer; incentivized review)