What users are saying about

Apache Pig

18 Ratings

Apache Spark

97 Ratings

Apache Pig
Score 7.3 out of 10

Apache Spark
Score 8.6 out of 10


Likelihood to Recommend

Apache Pig

It is a great option for database pipelining. It is highly effective for working with unstructured datasets. Also, because Apache Pig is a procedural language, unlike SQL, it is easy to learn compared to other alternatives. However, alternatives like Apache Spark would be my recommendation because of their wide range of advanced libraries, which reduce the effort of writing code from scratch.
Kartik Chavan

Apache Spark

Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to perform data transformations on big data as offline jobs, and use MongoDB-like distributed database systems for more real-time queries.
Nitin Pasumarthy

Pros

Apache Pig

  • Provides a decent abstraction for MapReduce jobs, delivering results faster than writing your own MR jobs
  • Good documentation and resources for learning Pig Latin (the domain-specific language of the Apache Pig platform)
  • Large community allows for easy learning, support, and feature improvements/updates

Apache Spark

  • Machine learning
  • Data analysis
  • Workflow processing (faster than MapReduce)
  • SQL connector to multiple data sources
Anson Abraham

Cons

Apache Pig

  • Improve Spark support and compatibility
  • Spark and Hive are already mainstream, and both have an instruction set that is easier to learn and can be mastered in a matter of days. While Apache Pig used to be a great alternative to writing Java MapReduce, Hive, after significant updates, is now equal to or better than Pig.

Apache Spark

  • Data visualization
  • Still waiting for web development of small apps to start using Spark as backbone middleware and HDFS as the data retrieval file system.
  • The available transformations and actions are limited, so the API must be modified to support more features.
Kamesh Emani

Usability

Apache Pig: 10.0
Based on 1 answer
Apache Pig is quick and easy to implement, which makes it quite popular.
Subhadipto Poddar

Apache Spark: No score
No answers on this topic

Alternatives Considered

I use both Apache Pig and its alternatives, such as Apache Spark and Apache Hive. Apache Pig was one of the best options in Big Data's initial stages, but alternatives have since taken over the market, leaving Apache Pig behind the competition. It is still a better alternative to MapReduce, and it is also a good option for working with unstructured datasets. Moreover, in certain cases, Apache Pig is much faster than Hive and Spark.
Kartik Chavan
There are a few newer frameworks for general processing, like Flink and Beam; frameworks for streaming, like Samza and Storm; and traditional MapReduce. I think Spark is at a sweet spot where it's clearly better than MapReduce for many workflows, yet has gained enough community support that there is little risk in deploying it. It also integrates batch and streaming workflows and APIs, providing an all-in-one package for multiple use cases.

Return on Investment

Apache Pig

  • Steeper learning curve than other similar technologies, so onboarding new engineers or changing ownership of Apache Pig code tends to be a bit of a headache
  • Once the language is learned and understood, it can be relatively straightforward to write simple Pig scripts, so development can go relatively quickly with a skilled team
  • As distributed technologies grow and improve, Apache Pig overall feels left in the dust and is more legacy code to support than something to actively develop with.

Apache Spark

  • Positive: we don't worry about scale.
  • Positive: large support community.
  • Negative: takes time to set up; overkill for many simpler workflows.

Pricing Details

Apache Pig

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No
Additional Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No
Additional Pricing Details