What users are saying about Apache Pig and Apache Spark

Apache Pig
Score 7.9 out of 10
24 Ratings

Apache Spark
Score 8.8 out of 10
145 Ratings

trScore algorithm: Learn more at https://www.trustradius.com/static/about-trustradius-scoring

Attribute Ratings

  • Apache Pig is rated higher in 1 area: Usability
  • Apache Spark is rated higher in 2 areas: Likelihood to Recommend, Support Rating

Likelihood to Recommend

  • Apache Pig: 7.4 (74%), 8 Ratings
  • Apache Spark: 9.2 (92%), 22 Ratings

Likelihood to Renew

  • Apache Pig: N/A, 0 Ratings
  • Apache Spark: 10.0 (100%), 1 Rating

Usability

  • Apache Pig: 10.0 (100%), 1 Rating
  • Apache Spark: 9.4 (94%), 2 Ratings

Support Rating

  • Apache Pig: 6.0 (60%), 2 Ratings
  • Apache Spark: 8.7 (87%), 6 Ratings

Likelihood to Recommend

Apache Pig

Apache Pig is a lightweight framework that is simple to learn and put into production. It lets you write SQL-like Pig Latin queries that run as MapReduce jobs. It also reduces the data and performs some simple mathematical functions, and its ability to combine data sets is incredibly beneficial. With Apache Pig's date-time functions, we can get quicker results. It works on 150-180 GB monthly datasets and reduces them in a few minutes. However, it cannot perform sequential operations, such as comparing consecutive lines. Another flaw is that it doesn't allow loops and nested loops to span more than one variable at a time. Even so, I'd say go for it!
Sourov K Chowdhury | TrustRadius Reviewer
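
To make the "comparing consecutive lines" limitation concrete, here is a minimal plain-Python sketch of that kind of operation. It is not Pig Latin and does not come from the review; the sample data and the function name `day_over_day_changes` are hypothetical. The point it illustrates is that each result depends on the order of the rows, which is awkward to express in a language built around unordered relations.

```python
# Hypothetical daily totals; the values are made up for illustration.
daily_totals = [
    ("2024-01-01", 120),
    ("2024-01-02", 135),
    ("2024-01-03", 128),
]


def day_over_day_changes(rows):
    """Return (date, change) pairs comparing each row with the previous one."""
    changes = []
    # zip pairs every row with its successor, so the row order matters.
    for (_, prev_value), (date, value) in zip(rows, rows[1:]):
        changes.append((date, value - prev_value))
    return changes


print(day_over_day_changes(daily_totals))
```

With fewer than two rows there is nothing to compare, so the function returns an empty list.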

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and making sense of very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because they take too much time to run when the data sets get very large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young | TrustRadius Reviewer

Pros

Apache Pig

  • Long logic in Java? Apache Pig is a good alternative.
  • Has a lot of great features, including table joins on many databases like DBMS, Hive, Spark-SQL, etc.
  • Faster and easier development compared to regular MapReduce jobs.
Kartik Chavan | TrustRadius Reviewer

Apache Spark

  • Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between SQL and Scala / Python style of munging data
Nitin Pasumarthy | TrustRadius Reviewer

Cons

Apache Pig

  • General syntax for the FOREACH ... GENERATE feature is confusing for nested actions.
  • The docs are hard to navigate, but it is made up for by reasonable examples.
  • A version less than 1.0 doesn't instill confidence in the product that has been around for over half a decade (as of writing).
Jordan Moore | TrustRadius Reviewer

Apache Spark

  • Memory management. Very weak on that.
  • PySpark is not as robust as Scala with Spark.
  • Spark master HA (high availability) is needed. It is not as highly available as it should be.
  • Data locality should not be a necessity, though it does help performance. I would prefer not to depend on locality.
Anson Abraham | TrustRadius Reviewer

Pricing Details

Apache Pig

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No

Starting Price

Apache Spark

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No

Starting Price

Likelihood to Renew

Apache Pig

No score
No answers yet
No answers on this topic

Apache Spark

Apache Spark 10.0
Based on 1 answer
Its capacity for computing on data in a cluster, and its fast speed.
Steven Li | TrustRadius Reviewer

Usability

Apache Pig

Apache Pig 10.0
Based on 1 answer
Apache Pig is quick and easy to implement, which makes it quite popular.
Subhadipto Poddar | TrustRadius Reviewer

Apache Spark

Apache Spark 9.4
Based on 2 answers
The only thing I dislike about Spark's usability is the learning curve, since there are many actions and transformations. However, its wide range of uses for ETL processing, its ease of integration, and its multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
Anonymous | TrustRadius Reviewer

Support Rating

Apache Pig

Apache Pig 6.0
Based on 2 answers
The documentation is adequate. I'm not sure how large of an external community there is for support.
Jordan Moore | TrustRadius Reviewer

Apache Spark

Apache Spark 8.7
Based on 6 answers
1. It integrates very well with Scala or Python.
2. Its SQL interoperability is very easy to understand.
3. Apache Spark is way faster than other competing technologies.
4. The support from the Apache community for Spark is huge.
5. Execution times are faster compared to others.
6. There are a large number of forums available for Apache Spark.
7. Code for Apache Spark is simple and easy to access.
8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Yogesh Mhasde | TrustRadius Reviewer

Alternatives Considered

Apache Pig

Apache Pig might help to get things started faster at first, and it was one of the best tools years ago, but it lacks important features that are needed in the data engineering world right now. Pig also has a steeper learning curve, since it uses its own language (Pig Latin), compared to Spark, which can be coded in Python or Java.
Anonymous | TrustRadius Reviewer

Apache Spark

Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
Anonymous | TrustRadius Reviewer

Return on Investment

Apache Pig

  • Higher learning curve than other similar technologies, so on-boarding new engineers or changing ownership of Apache Pig code tends to be a bit of a headache
  • Once the language is learned and understood it can be relatively straightforward to write simple Pig scripts so development can go relatively quickly with a skilled team
  • As distributed technologies grow and improve, overall Apache Pig feels left in the dust and is more legacy code to support than something to actively develop with.
Anonymous | TrustRadius Reviewer

Apache Spark

  • Business leaders are able to make data-driven decisions
  • Business users are able to access data in near real time now. Before using Spark, they had to wait at least 24 hours for data to be available
  • The business is able to come up with new product ideas
Surendranatha Reddy Chappidi | TrustRadius Reviewer
