Apache Pig vs. IBM Db2 Big SQL

Apache Pig

22 Reviews and Ratings

IBM Db2 Big SQL

18 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Pig	Score 8.4 out of 10	N/A	Apache Pig is a programming tool for creating MapReduce programs used in Hadoop.	N/A
Db2 Big SQL	Score 9.0 out of 10	N/A	IBM offers Db2 Big SQL, an enterprise-grade, hybrid, ANSI-compliant SQL-on-Hadoop engine that delivers massively parallel processing (MPP) and advanced data query. Big SQL offers a single database connection or query for disparate sources such as HDFS, RDBMS, NoSQL databases, object stores, and WebHDFS.	N/A

Pricing

	Apache Pig	IBM Db2 Big SQL
Editions & Modules	No answers on this topic	No answers on this topic

Offerings

Pricing Offerings	Apache Pig	Db2 Big SQL
Free Trial	No	No
Free/Freemium Version	Yes	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	—	—

Community Pulse
Considered Both Products

Apache Pig
Verified User, Anonymous (chose Apache Pig; incentivized review): Apache Hadoop, Azure Data Lake Storage, Amazon EMR (Elastic MapReduce), Presto (formerly Presto DB), Confluent Platform and Alteryx
Sourov K Chowdhury, Database Software Engineer (chose Apache Pig; incentivized review): It takes me less time to write a Pig script than to get a Spark program running for batch ETL workloads. Compared to Spark, Pig has a steeper learning curve because it employs a proprietary programming language. In one script and one file, it can handle both MapReduce and Hadoop. …
Verified User, Anonymous (chose Apache Pig; incentivized review): It can accommodate MapReduce in a single script and a single file. It has plenty of documentation for easy learning. SQL-like queries make it easy to understand.
Verified User, Anonymous (chose Apache Pig; incentivized review): Apache Pig might help to start things faster at first, and it was one of the best tools years ago, but it lacks important features that are needed in the data engineering world right now. Pig also has a steeper learning curve since it uses a proprietary language compared to Spark …
Jordan Moore, Software Consultant (chose Apache Pig; incentivized review): Pig is more focused on scripting in its own Pig Latin language rather than integrating into another language like Java/Scala/Python/SQL. However, for batch ETL workloads, I find that I can write a Pig script quicker than setting up and deploying a Spark program, for example.
Subhadipto Poddar, Research Assistant (chose Apache Pig; incentivized review): Apache Pig is picked up quickly and can be implemented with very little coding skill. Also, the other languages require exact matching of versions during installation, which made them somewhat less user-friendly. Also most of the tasks that are done in map reduce can be done …
Kartik Chavan, Data Analyst (chose Apache Pig; incentivized review): I use both Apache Pig and its alternatives like Apache Spark & Apache Hive. Apache Pig was one of the best options in Big Data's initial stages. But now alternatives have taken over the market, leaving Apache Pig behind in the competition. But it is still a better alternative …
Verified User, Anonymous (chose Apache Pig; incentivized review): Early on, Apache Pig was a great tool for easily writing distributed processing applications without needing to write a complete Java MapReduce job from scratch, but as time has moved on there are now better alternatives to get results faster for both ad-hoc analysis and for …
Verified User, Anonymous (chose Apache Pig; incentivized review): Provided better ways for optimized Hadoop jobs than Hive, but not anymore. Spark DSL is much more advanced and compute times are significantly less.

IBM Db2 Big SQL
Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer (chose Db2 Big SQL; incentivized review): MS SQL Server was ruled out given we didn't feel we could collapse environments. We thought of MS-SQL as more of a one-for-one replacement for Sybase ASE, i.e., server for server. SAP HANA was evaluated and given a big thumbs up but was rejected because the SQL would have …

Best Alternatives
	Apache Pig	IBM Db2 Big SQL
Small Businesses	No answers on this topic	No answers on this topic
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Cloudera Manager Score 9.9 out of 10
Enterprises	IBM Analytics Engine Score 8.6 out of 10	IBM Analytics Engine Score 8.6 out of 10

User Ratings
	Apache Pig	IBM Db2 Big SQL
Likelihood to Recommend	8.2 (0 ratings)	9.0 (0 ratings)
Usability	10.0 (0 ratings)	8.0 (0 ratings)
Support Rating	6.0 (0 ratings)	8.8 (0 ratings)

User Testimonials
	Apache Pig	IBM Db2 Big SQL
Likelihood to Recommend	Apache Pig is best suited for ETL-based data processes. It performs well when handling and analyzing large amounts of data, and it gives faster results than other similar tools. It is easy to implement, and any user with some initial training or prior SQL knowledge can work with it. Apache Pig has a large global community. (Incentivized review by Verified User, Anonymous)	IBM Db2 is a legacy database and is primarily great for supporting certain legacy applications. It's simply not as competitive as many solutions on the market now. (Incentivized review by John Spies, Database Administrator)
Pros	Iterative development: you can write aliases/variables, which are not immediately executed and are stored in a DAG that is only evaluated upon dumping or storing another alias; Fast execution: works with the MapReduce, Tez, or Spark execution frameworks to provide fast run times at large scales; Local and remote interoperability: scripts that depend on testing a small dataset locally before moving to the full thing can simply be run with "pig -x local". (Incentivized review by Jordan Moore, Software Consultant)	Data storage; data manipulation; data definitions; data reliability. (Incentivized review by John Spies, Database Administrator)
Cons	May not fit every need, and a SQL-like abstraction may be more effective for some tasks (look at Spark SQL, Hive, or even an actual DBMS); All Pig jobs are written in a domain-specific language, so there is not a lot of transferable knowledge; Writing your own user-defined functions (UDFs) is a nice feature but can be painful to implement in practice. (Incentivized review by Verified User, Anonymous)	Cloud readiness; ease of implementation. (Incentivized review by Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer)
Usability	Apache Pig is quick and easy to implement, which makes it quite popular. (Incentivized review by Subhadipto Poddar, Research Assistant)	IBM DB2 is a solid service but hasn't seen much innovation over the past decade. It gets the job done and supports our IT operations across digital, so it is fair. (Incentivized review by John Spies, Database Administrator)
Support Rating	The documentation is adequate. I'm not sure how large of an external community there is for support. (Incentivized review by Jordan Moore, Software Consultant)	IBM did a good job of supporting us during our evaluation and proof of concept. They were able to provide all necessary guidance, answer questions, help us architect it, etc. We were pleased with the support provided by the vendor. I will caveat and say this support was all before the sale; however, we have a ton of IBM products and they provide the same high level of support for all of them. I didn't see this being any different. I give IBM support two thumbs up! (Incentivized review by Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer)
Alternatives Considered	It takes me less time to write a Pig script than to get a Spark program running for batch ETL workloads. Compared to Spark, Pig has a steeper learning curve because it employs a proprietary programming language. In one script and one file, it can handle both MapReduce and Hadoop. It has a large amount of documentation available to make learning more convenient. (Incentivized review by Sourov K Chowdhury, Database Software Engineer)	MS SQL Server was ruled out given we didn't feel we could collapse environments. We thought of MS-SQL as more of a one-for-one replacement for Sybase ASE, i.e., server for server. SAP HANA was evaluated and given a big thumbs up but was rejected because the SQL would have to be rewritten at the time (now they have an accelerator so you don't have to). Also, there was a very low adoption rate within the enterprise. IBM DB2 Big SQL was not selected even though technically it achieved high scores, because we could not find readily available talent and because of the low adoption rate within the enterprise (basically no adoption at the time). We ended up selecting Exadata because of the high adoption rate within the enterprise even though technically HANA and Big SQL were superior in our evaluations. (Incentivized review by Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer)
Return on Investment	The return on investment is significant considering what it can do with traditional analysis techniques. However, with alternatives like Apache Spark and Hive being more efficient, it is hard to stick with Apache Pig. It can handle large datasets fairly easily compared to SQL, but, again, the alternatives are more efficient. When working on unstructured, decentralized datasets, Pig is highly beneficial, as it is not a complete departure from SQL, yet it does not drag you into the complexity of MapReduce either. (Incentivized review by Kartik Chavan, Data Analyst)	Performance gains were positive. Finding resources on the street with knowledge at the time was hard. (Incentivized review by Gene Baker, Vice President, Chief Architect, Development Manager and Software Engineer)
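
The deferred evaluation mentioned under the Apache Pig pros above (aliases are recorded in a plan and only run when one of them is dumped or stored) can be illustrated with a small toy model. This is a rough sketch in Python, not Pig Latin and not a description of how Pig is implemented; the `Alias` class and its `filter`, `foreach`, and `dump` methods are hypothetical names chosen for illustration, and the plan here is a simple chain rather than a general DAG.

```python
# Toy model of deferred evaluation. Assumption: all names here are
# hypothetical and only mimic the behavior the reviewer describes.
from typing import Callable, Iterable, List, Optional


class Alias:
    """A named step in a plan; it points at its parent instead of holding data."""

    def __init__(self, name: str,
                 source: Optional[Iterable] = None,
                 parent: Optional["Alias"] = None,
                 op: Optional[Callable[[List], List]] = None) -> None:
        self.name = name
        self.source = source
        self.parent = parent
        self.op = op
        self.runs = 0  # counts how many times this step actually executed

    def filter(self, name: str, pred: Callable) -> "Alias":
        # Defining a filter only records the step; no rows are touched.
        return Alias(name, parent=self,
                     op=lambda rows: [r for r in rows if pred(r)])

    def foreach(self, name: str, fn: Callable) -> "Alias":
        # Defining a projection only records the step as well.
        return Alias(name, parent=self,
                     op=lambda rows: [fn(r) for r in rows])

    def dump(self) -> List:
        """Execute this step and everything it depends on."""
        self.runs += 1
        if self.parent is None:
            return list(self.source)
        return self.op(self.parent.dump())


raw = Alias("raw", source=[("a", 3), ("b", 12), ("c", 7)])
big = raw.filter("big", lambda r: r[1] > 5)
names = big.foreach("names", lambda r: r[0])

# Nothing has executed yet: the three aliases only describe a plan.
assert raw.runs == 0 and big.runs == 0 and names.runs == 0

result = names.dump()  # evaluation happens here, for the whole chain
print(result)  # ['b', 'c']
```

The point of the sketch is the ordering: defining `big` and `names` costs nothing, and the whole chain runs once when `names.dump()` is called, which is what makes step-by-step script development cheap.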