What users are saying about
18 Ratings
103 Ratings
18 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow'>trScore algorithm: Learn more.</a>
Score 7.3 out of 101
103 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow'>trScore algorithm: Learn more.</a>
Score 8.5 out of 101

Add comparison

Likelihood to Recommend

Apache Pig

It is one great option in terms of database pipelining. It is highly effective for unstructured datasets to work with. Also, Apache Pig being a procedural language, unlike SQL, it is also easy to learn compared to other alternatives. But other alternatives like Apache Spark would be my recommendation due to the high availability of advanced libraries, which will reduce our extra efforts of writing from scratch
Kartik Chavan profile photo

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and trying to make sense of very, very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because very large data sets require too much time when the data sets get too large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young profile photo

Pros

  • Apache pig DSL provides a better alternative to Java map reduce code and the instruction set is very easy to learn and master.
  • It has many advanced features built-in such as joins, secondary sort, many optimizations, predicate push-down, etc.
  • When Hive was not very advanced (extremely slow) few years ago, pig has always been the go to solution. Now with Spark and Hive (after significant updates), the need to learn apache pig may be questionable.
No photo available
  • Apache Spark makes processing very large data sets possible. It handles these data sets in a fairly quick manner.
  • Apache Spark does a fairly good job implementing machine learning models for larger data sets.
  • Apache Spark seems to be a rapidly advancing software, with the new features making the software ever more straight-forward to use.
Thomas Young profile photo

Cons

  • Improve Spark support and compatibility
  • Spark and Hive are already being used main-stream, both of them have an instruction set that is easier to learn and master in a matter of days. While apache pig used to be a great alternative to writing java map reduce, Hive after significant updates is now either equal or better than pig.
No photo available
  • Resource heavy, jobs, in general, can be very memory intensive and you will want the nodes in your cluster to reflect that.
  • Debugging, it has gotten better with every release but sometimes it can be difficult to debug an error due to ambiguous or misleading exceptions and stack traces.
No photo available

Usability

Apache Pig10.0
Based on 1 answer
It is quick, fast and easy to implement Apache Pig which makes is quite popular to be used.
Subhadipto Poddar profile photo
No score
No answers yet
No answers on this topic

Alternatives Considered

I use both Apache Pig and its alternatives like Apache Spark & Apache Hive. Apache Pig was one of the best options in Big Data's initial stages. But now alternatives have taken over the market, rendering Apache Pig behind in the competition. But it is still a better alternative to Map Reduce. It is also a good option for working with unstructured datasets. Moreover, in certain cases, Apache Pig is much faster than Hive & Spark.
Kartik Chavan profile photo
I prefer Apache Spark compared to Hadoop, since in my experience Spark has more usability and comes equipped with simple APIs for Scala, Python, Java and Spark SQL, as well as provides feedback in REPL format on the commands. At the same time, Apache Spark seems to have the best performance in the processing of large data that works in memory and, therefore, more processes can be downloaded on Spark than on Hadoop, despite the fact that Hadoop is also a very useful tool.
Carla Borges profile photo

Return on Investment

  • Return on Investments are significant considering what it can do with traditional analysis techniques. But, other alternatives like Apache Spark, Hive being more efficient, it is hard to stick to Apache Pig.
  • It can handle large datasets pretty easily compared to SQL. But, again, alternatives are more efficient.
  • While working on unstructured, decentralized dataset, Pig is highly beneficial, as it is not a complete deviation from SQL, but it does not take you in complexity MapReduce as well.
Kartik Chavan profile photo
  • Apache Spark has faster performance compared to MapReduce.
  • Combination of Python & Spark is the best. Shorter code, faster and efficient performance.
  • Can replace RDBMS
Kartik Chavan profile photo

Pricing Details

Apache Pig

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details