What users are saying about Apache Pig and Hadoop
Apache Pig: 18 Ratings, Score 7.3 out of 10
Hadoop: 219 Ratings, Score 8 out of 10
trScore algorithm: https://www.trustradius.com/static/about-trustradius-scoring


Likelihood to Recommend

Apache Pig

It is a great option in terms of database pipelining, and it is highly effective for working with unstructured datasets. Because Apache Pig is a procedural language, unlike SQL, it is also easy to learn compared to other alternatives. However, alternatives like Apache Spark would be my recommendation because of the wide availability of advanced libraries, which reduces the extra effort of writing everything from scratch.
Kartik Chavan

Hadoop

Hadoop is well suited for healthcare organizations that deal with huge amounts of data and need to optimize it.

Pros

  • The Apache Pig DSL provides a better alternative to Java MapReduce code, and the instruction set is very easy to learn and master.
  • It has many advanced features built in, such as joins, secondary sort, many optimizations, predicate push-down, etc.
  • A few years ago, when Hive was not very advanced (extremely slow), Pig was always the go-to solution. Now, with Spark and Hive (after significant updates), the need to learn Apache Pig may be questionable.
  • Capability to collaborate with R Studio. Most of the statistical algorithms can be deployed.
  • Handling Big Data issues like storage, information retrieval, data manipulation, etc.
  • Redundant tasks like data wrangling, data processing, and cleaning are more efficient in Hadoop as the processing times are faster.
Kunal Sonalkar

Cons

  • Python UDF errors are not interpretable. A developer can struggle for a very long time after hitting one of these errors.
  • Being at an early stage, it still has only a small community to turn to for help.
  • It still needs a lot of improvement. A datetime module for time series, which is a very basic requirement, was added only recently.
Kartik Chavan
  • Hadoop requires capable hardware, such as a minimum of 8 GB of memory and an i5 processor. Sometimes the hardware does become a hindrance.
  • If we could connect Hadoop to Salesforce, that would be tremendous functionality, as most CRM data comes from that channel.
  • It would be good to have some geocoding features for anyone who wants to do spatial data analysis using latitudes and longitudes.
Kunal Sonalkar

Likelihood to Renew

Apache Pig: No score
No answers on this topic
Hadoop: 9.6
Based on 8 answers
Hadoop is organization-independent, can be used for purposes ranging from archiving to reporting, and can run on economical commodity hardware. There are also significant savings in licensing costs, since most of the Hadoop ecosystem is open source and free.
Bhushan Lakhe

Usability

Apache Pig: 10.0
Based on 1 answer
Apache Pig is quick and easy to implement, which makes it quite popular.
Subhadipto Poddar
Hadoop: 9.0
Based on 3 answers
I found it really useful during my academic projects. Data handling for large data sets was easy with Hadoop. It used to work really fast for bigger data sets. I found it reliable.
Tushar Kulkarni

Online Training

Apache Pig: No score
No answers on this topic
Hadoop: 6.1
Based on 2 answers
Hadoop is a complex topic and is best suited to classroom training. Online training is a waste of time and money.
Bhushan Lakhe

Alternatives Considered

I use both Apache Pig and its alternatives, such as Apache Spark and Apache Hive. Apache Pig was one of the best options in the early days of Big Data, but alternatives have since taken over the market, leaving Apache Pig behind the competition. It is still a better alternative to MapReduce, and it is a good option for working with unstructured datasets. Moreover, in certain cases, Apache Pig is much faster than Hive and Spark.
Kartik Chavan
Apache Spark can be considered as an alternative because of its similar capabilities around processing and storing big data. The reason we went with Hadoop was the literature available online and integration capability with platforms like R Studio. The popularity of Hadoop has helped us in debugging issues and solving problems at a faster rate.
Kunal Sonalkar

Return on Investment

  • Return on investment is significant considering what it can do compared with traditional analysis techniques. However, with alternatives like Apache Spark and Hive being more efficient, it is hard to stick with Apache Pig.
  • It can handle large datasets pretty easily compared to SQL. But again, the alternatives are more efficient.
  • When working with unstructured, decentralized datasets, Pig is highly beneficial: it is not a complete departure from SQL, yet it does not drag you into the complexity of MapReduce either.
Kartik Chavan
  • Positive: it is powerful, and it allows you to manage your data on a very big scale.
  • Negative: since it's computationally expensive, the laptops had to be upgraded, and that was pretty heavy on the finances.
  • Positive: it also has given us the power to make data-driven decisions anytime and anywhere.
Kunal Sonalkar

Pricing Details

Apache Pig

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details

Hadoop

General
Free Trial
Free/Freemium Version: Yes
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details