Apache Hive vs. Db2

Overview
Product | Rating | Most Used By | Product Summary | Starting Price
Apache Hive
Score 8.2 out of 10
N/A
Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.
N/A
Db2
Score 8.7 out of 10
N/A
Db2 is a family of relational database software solutions offered by IBM. It includes standard Db2 and Db2 Warehouse editions, deployable either in the cloud or on-premises.
$0
Pricing
Apache Hive | Db2
Editions & Modules
No answers on this topic
Db2 on Cloud Lite
$0
Db2 on Cloud Standard
$99
per month
Db2 Warehouse on Cloud Flex One
$898
per month
Db2 on Cloud Enterprise
$946
per month
Db2 Warehouse on Cloud Flex for AWS
$2,957
per month
Db2 Warehouse on Cloud Flex
$3,451
per month
Db2 Warehouse on Cloud Flex Performance
$13,651
per month
Db2 Warehouse on Cloud Flex Performance for AWS
$13,651
per month
Db2 Standard Edition
Contact us
Db2 Advanced Edition
Contact us
Offerings
Pricing Offerings
Apache Hive | Db2
Free Trial
No | Yes
Free/Freemium Version
No | Yes
Premium Consulting/Integration Services
No | Yes
Entry-level Setup Fee
No setup fee | Optional
Apache Hive | Db2
Small Businesses
Google BigQuery
Score 8.6 out of 10
SingleStore
Score 9.8 out of 10
Medium-sized Companies
Cloudera Enterprise Data Hub
Score 9.0 out of 10
SingleStore
Score 9.8 out of 10
Enterprises
Oracle Exadata
Score 8.2 out of 10
SingleStore
Score 9.8 out of 10
User Ratings
Apache Hive | Db2
Likelihood to Recommend
8.0
(35 ratings)
8.6
(74 ratings)
Likelihood to Renew
10.0
(1 rating)
8.0
(12 ratings)
Usability
8.5
(7 ratings)
8.7
(7 ratings)
Availability
-
(0 ratings)
8.7
(51 ratings)
Performance
-
(0 ratings)
9.1
(11 ratings)
Support Rating
7.0
(6 ratings)
6.0
(6 ratings)
In-Person Training
-
(0 ratings)
8.2
(1 rating)
Implementation Rating
-
(0 ratings)
9.0
(2 ratings)
Configurability
-
(0 ratings)
9.1
(1 rating)
Ease of integration
-
(0 ratings)
8.2
(1 rating)
Product Scalability
-
(0 ratings)
8.7
(51 ratings)
Vendor post-sale
-
(0 ratings)
8.2
(1 rating)
Vendor pre-sale
-
(0 ratings)
8.2
(1 rating)
User Testimonials
Apache Hive | Db2
Likelihood to Recommend
Apache
It executes workloads at large scale and is a good fit for new projects or organizational changes. Data lineage mapping has always been dubious, but this tool has produced good results. You can store and synchronize data from different departments; the storage process can be manual, but it is best automated.
IBM
I can think of a couple of use cases, but the obvious ones are in fintech and retail, because of the amount of transactional and event-level data for global operations. It is imperative to have a solution that can handle such large-scale data, in real-time and batch delivery, inbound and outbound, and that ultimately supports workload management, in some cases for around-the-clock SLAs.
Pros
Apache
  • Apache Hive allows us to write expressive solutions to complex problems thanks to its SQL-like syntax.
  • Relatively easy to set up and start using.
  • Very little ramp-up to start using the actual product, documentation is very thorough, there is an active community, and the code base is constantly being improved.
IBM
  • DB2 maintains itself very well. The Task Scheduler component of DB2 allows for statistics gathering and reorganization of indexes and tables without user interaction or without specific knowledge of cron or Windows Task Scheduler / Scheduled jobs.
  • Its use of ASYNC, NEARSYNC, and SYNC HADR (High Availability Disaster Recovery) models gives you a range of options for maintaining a very high uptime ratio. Failover from PRIMARY to SECONDARY becomes very easy with just a single command or windowed mouse click.
  • Task Scheduler (DB2 9.7 and earlier) allows jobs to be run within other jobs, and exit and error codes can define which other jobs are run. This allows for ease of maintenance without third-party software.
  • Tablespace usage and automatic storage help keep your data segmented while at rest, making partitioning easier.
  • Ability to run commands via CLI (Command Line Interface) or via Control Center / Data Studio (DB2 10.x+) makes administration a breeze.
Cons
Apache
  • Some queries, particularly complex joins, are still quite slow and can take hours
  • Previous jobs and queries are sometimes not stored.
  • Switching to Impala can sometimes be time-consuming (i.e. the system hangs, or is slow to respond).
  • Sometimes directories and tables don't load properly, which causes confusion.
IBM
  • The relational model requires a rigid schema that does not necessarily fit with some types of modern development.
  • It is a proprietary database that requires a lot of hardware for good performance, and its costs are high.
  • As data grows in a production environment, it becomes slow.
Likelihood to Renew
Apache
I do not know of another data warehouse solution that integrates with HDFS as well as Hive does.
IBM
The DB2 database is a solid option for our school. We have been on this journey now for 3-4 years, so we are still adapting to what it can do. We will renew our use of DB2 because we don’t see a major need to change. Also, changing a main database in a school environment is a major project, so we’ll avoid that if possible.
Usability
Apache
Hive is a very good big data analysis and ad-hoc query platform, which also supports scaling. BI processes can be easily integrated with Hadoop via Hive. It can deal with much larger data sets than a traditional RDBMS can. It is a "must-have" component of the big data domain.
IBM
You have to be well versed in using the technology, not only from a GUI interface but from a command line interface to successfully use this software to its fullest.
Reliability and Availability
Apache
No answers on this topic
IBM
I have never had DB2 go down unexpectedly. It just works solidly every day. When I look at the logs, sometimes DB2 has figured out there was a need to build an index. Instead of waiting for me to do it, the database automatically created the index for me. At my current company, we have had zero issues for the past 8 years. We have upgraded the server 3 times and upgraded the OS each time, and the only thing we saw was that DB2 got better and faster. It is simply amazing.
Performance
Apache
No answers on this topic
IBM
Performance is exceptional if you take care to maintain the database. It is a very powerful tool and at the same time very easy to use. In our installation, we expect a DB machine on the mainframe with access to the database through ODBC connectors directly from branch servers, with a fabulous end-user experience.
Support Rating
Apache
Apache Hive is a FOSS project. There is little need to comment on the support behind open source and its developer community: it has tremendous developer support and awesome documentation. Much of the support you need can be gathered from the community.
IBM
Easily the best product support team. :) Whenever we have questions, they have answered those in a timely manner and we like how they go above and beyond to help.
In-Person Training
Apache
No answers on this topic
IBM
The material was very clear and all subjects were covered.
Implementation Rating
Apache
No answers on this topic
IBM
Db2 works well with the application, and the replication tool can keep up.
Alternatives Considered
Apache
Besides Hive, I have used Google BigQuery, which is costly but has very high computation speed. Amazon Redshift is another product I used at my most recent organisation. Both Redshift and BigQuery are managed solutions, whereas Hive needs to be managed by the user.
IBM
DB2 was more scalable and easily configurable than other products we evaluated and short listed in terms of functionality and pricing. IBM also had a good demo on premise and provided us a sandbox experience to test out and play with the product and DB2 at that time came out better than other similar products.
Scalability
Apache
No answers on this topic
IBM
Because I use DB2 only to support my IzPCA activities, my knowledge here is somewhat limited.

Anyway, from what I was able to understand, DB2 is extremely scalable.

Maybe the information below could serve as an example of scalability. The customer has a huge mainframe environment: 13 z15 CECs, around 80 LPARs, and maybe more than 50 Sysplexes (I am not totally sure about this last figure).

Today we have 7 IzPCA databases, each one in a distinct Sysplex.

Plans are underway to eventually have a small LPAR with only one DB2 subsystem and only one database, transmit the data from many other LPARs, and process all the data in this single database.

The IzPCA collect process (read the data received, manipulate it, and insert rows in the tables) is today a huge process, demanding many elapsed hours and lots of CPU.

Almost 100% of the tables are PBR type and insert jobs run in parallel, but in 4 of the 7 databases it is a really huge and long process.

Combining the INSERT loads from the 7 databases into only one would be impossible.

But IzPCA recently introduced a new feature called "Continuous Collector". With that feature, small amounts of data will be transmitted to the central LPAR every 5 minutes (or even less) and processed immediately, in a short period of time and with little CPU use, instead of one or two transmissions per day of very large amounts of data, with the corresponding collect jobs occurring only once or twice a day, with long elapsed times and huge CPU consumption.

I suspect the total CPU seconds consumed will be more or less the same in both cases, but with the new method it will occur in small bursts many times a day!
Return on Investment
Apache
  • Apache Hive is a secure and scalable solution that helps increase overall organizational productivity.
  • Apache Hive can handle and process large amounts of data in a reasonable amount of time.
  • It simplifies writing SQL queries, hence helping the organization as most companies use SQL for all query jobs.
IBM
  • Fast response time by processing optimization and cost reduction by reduced CPU utilization. Nowadays, good performance is a necessary condition for the survival of a company and its sustained growth
  • SQL enhancements are targeted to improve performance, simplify current and new applications, and reduce the development cycle time to market.
  • A CPU reduction at peak times can immediately reduce our TCO by reducing software costs related to CPU utilization.
  • Impressive reductions in memory requirements, which used to limit the concurrent database activity
  • Out-of-the-box savings without changing the database or application