Apache Pig

Score 8.0 out of 10

Overview

Recent Reviews

Apache Pig

7 out of 10
April 07, 2022
We mainly use Apache Pig for its capabilities, which allow us to easily create data pipelines. It also comes with its native language, Pig …

Apache Pig - lot to improve

7 out of 10
April 28, 2021
Apache Pig and its query language (Pig Latin) allowed us to create data pipelines with ease and are heavily used by our teams. The language …

Pricing

N/A
Unavailable

What is Apache Pig?

Apache Pig is a programming tool for creating MapReduce programs used in Hadoop.

Entry-level setup fee?

  • No setup fee

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting / Integration Services

Product Details

Apache Pig Technical Details

Operating Systems: Unspecified
Mobile Application: No

Reviews and Ratings (24)

Reviews

(1-9 of 9)
Score 9 out of 10
Vetted Review
Verified User
  • It provides great support for large datasets and ad-hoc reporting.
  • It has operators for almost every common action, such as Join, Sort, and Merge.
  • Anybody can use Apache Pig with some initial training, and it feels very familiar to anyone who knows SQL.
  • It can handle almost all structured and unstructured data.
  • Apache Pig is built around data flows, so users can easily see all the processes and information.
  • One of the most important limitations of Apache Pig is that it does not support OLTP (Online Transaction Processing); it only supports OLAP (Online Analytical Processing).
  • Apache Pig has very high latency compared to MapReduce.
  • Apache Pig is designed for ETL and is thus not well suited to real-time analysis.
  • The training materials are hard to learn from and need improvement.
Sourov K Chowdhury | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
  • Its performance, ease of use, and simplicity in learning and deployment.
  • Using this tool, we can quickly analyze large amounts of data.
  • It's adequate for map-reducing large datasets and fully abstracts MapReduce.
  • Debugging errors in Pig consumes most of the development time because it can be unstable and immature.
  • It is significantly more challenging to learn and master than Hive. It's a little slower than Spark.
April 07, 2022

Apache Pig

Score 7 out of 10
Vetted Review
Verified User
  • Useful for map-reducing huge datasets
  • Easy to learn and deploy
  • Optimization is better than in comparable products.
  • The pace of introducing new features is very slow.
  • The community is also relatively small because it is still at an early stage.
  • There is no debugging functionality; checks happen only at compile time.
Jordan Moore | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
  • Iterative development - you can define aliases (variables) that are not executed immediately; they are stored in a DAG, which is evaluated only when an alias is dumped or stored.
  • Fast execution - works with the MapReduce, Tez, or Spark execution frameworks to provide fast run times at large scales.
  • Local and remote interoperability - testing a script on a small dataset locally before moving to the full dataset is as simple as running "pig -x local".
  • The general syntax of the FOREACH ... GENERATE feature is confusing for nested actions.
  • The docs are hard to navigate, but reasonable examples make up for it.
  • A version number below 1.0 doesn't instill confidence in a product that has been around for over half a decade (as of writing).
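
The deferred evaluation described in the first point of this review can be illustrated outside of Pig. The following is a minimal Python sketch, not Pig's implementation: the `Alias` class and its `load`, `filter`, `foreach`, and `dump` methods are hypothetical names chosen only to mirror Pig Latin's LOAD, FILTER, FOREACH, and DUMP. Defining an alias just records a node that points at its parent; nothing is computed until `dump` is called.

```python
from typing import Callable, Iterable, List, Optional


class Alias:
    """One node in a deferred data-flow graph (hypothetical, for illustration)."""

    def __init__(self, op: Callable, parent: Optional["Alias"] = None,
                 log: Optional[List[str]] = None, name: str = "load"):
        self.op = op                                # work to run at evaluation time
        self.parent = parent                        # upstream node, None for a source
        self.log = log if log is not None else []   # shared record of executed steps
        self.name = name

    @classmethod
    def load(cls, rows: Iterable) -> "Alias":
        # Source node: ignores its (absent) upstream and yields the rows.
        return cls(lambda _: iter(rows))

    def filter(self, pred: Callable) -> "Alias":
        # Only records a new node; no row is inspected here.
        return Alias(lambda rows: (r for r in rows if pred(r)),
                     self, self.log, "filter")

    def foreach(self, fn: Callable) -> "Alias":
        # Only records a new node; fn is not called here.
        return Alias(lambda rows: (fn(r) for r in rows),
                     self, self.log, "foreach")

    def _evaluate(self):
        # Walk up to the source first, then apply each step on the way back.
        upstream = self.parent._evaluate() if self.parent is not None else None
        self.log.append(self.name)
        return self.op(upstream)

    def dump(self) -> list:
        # The only call that triggers evaluation of the recorded graph.
        return list(self._evaluate())


raw = Alias.load([1, 2, 3, 4, 5, 6])
evens = raw.filter(lambda n: n % 2 == 0)
squares = evens.foreach(lambda n: n * n)

print(raw.log)         # [] (nothing has run yet)
print(squares.dump())  # [4, 16, 36]
print(raw.log)         # ['load', 'filter', 'foreach']
```

The shared `log` list exists only to make the timing visible: it stays empty while aliases are being defined and fills in once `dump` walks the graph.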
Subhadipto Poddar | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
  • Fast
  • Easy to implement
  • Can process data of almost any size
  • Easy to learn schema
  • It can only work on trivial arithmetic problems.
  • Looping across data is either unsupported or very difficult
  • Sequential checks are almost impossible to implement
Kartik Chavan | TrustRadius Reviewer
Score 7 out of 10
Vetted Review
Verified User
  • Long logic in Java? Apache Pig is a good alternative.
  • Has a lot of great features, including table joins on many databases like DBMS, Hive, Spark-SQL, etc.
  • Faster and easier development compared to regular MapReduce jobs.
  • Python UDF errors are not interpretable. A developer can struggle for a very long time after hitting one of these errors.
  • Being at an early stage, it still has only a small community to turn to for help.
  • It still needs a lot of improvement. A datetime module for time series, which is a very basic requirement, was added only recently.
Score 7 out of 10
Vetted Review
Verified User
  • Provides a decent abstraction for Map-Reduce jobs, allowing for a faster result than creating your own MR jobs
  • Good documentation and resources for learning Pig Latin (the Domain Specific Language of the Apache Pig platform)
  • Large community allows for easy learning, support, and feature improvements/updates
  • May not fit every need and a SQL-like abstraction may be more effective for some tasks (look at Spark-SQL, Hive, or even an actual DBMS)
  • All Pig jobs are written in a Domain Specific Language so not a lot of transferable knowledge
  • Writing your own User Defined Functions (UDFs) is a nice feature but can be painful to implement in practice
Score 8 out of 10
Vetted Review
Verified User
  • The Apache Pig DSL provides a better alternative to Java MapReduce code, and the instruction set is very easy to learn and master.
  • It has many advanced features built-in such as joins, secondary sort, many optimizations, predicate push-down, etc.
  • A few years ago, when Hive was not very advanced (extremely slow), Pig was always the go-to solution. Now, with Spark and Hive (after significant updates), the need to learn Apache Pig may be questionable.
  • Improve Spark support and compatibility
  • Spark and Hive are already in mainstream use, and both have an instruction set that is easier to learn and can be mastered in a matter of days. While Apache Pig used to be a great alternative to writing Java MapReduce, Hive, after significant updates, is now either equal to or better than Pig.